
Foundation model


Adaptation methods such as prompting and fine-tuning provide different tradeoffs between the costs of adaptation and the extent to which models are specialized. Some of the major facets to consider when adapting a foundation model are compute budget and data availability. Foundation models can be very large, up to trillions of parameters, so adapting an entire foundation model can be computationally expensive. Developers therefore sometimes adapt only the final layer or only the bias vectors to save time and space. For particularly niche applications, suitable data may also be unavailable in sufficient quantity to adapt the foundation model. In such circumstances, data must be manually labeled, which is costly and can demand expert knowledge.
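As a concrete illustration, the sketch below freezes a pretrained PyTorch model and trains either a new final layer or only the bias vectors; the attribute name head, the feature size, and the class count are illustrative assumptions rather than details of any particular foundation model.

    import torch.nn as nn

    def adapt_final_layer(model: nn.Module, in_features: int, num_classes: int) -> nn.Module:
        # Freeze every pretrained parameter so the backbone stays fixed.
        for param in model.parameters():
            param.requires_grad = False
        # Attach a new, trainable task-specific head (hypothetical attribute name).
        model.head = nn.Linear(in_features, num_classes)
        return model

    def adapt_biases_only(model: nn.Module) -> nn.Module:
        # Alternative: train only the bias vectors, leaving all weights frozen.
        for name, param in model.named_parameters():
            param.requires_grad = name.endswith("bias")
        return model

Because only the unfrozen parameters receive gradient updates, the memory and compute needed for this style of adaptation remain a small fraction of full fine-tuning.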
which the model is able to predict the next token in a sequence. Image models are commonly trained with contrastive learning or diffusion objectives. In contrastive learning, images are randomly augmented and the model is trained to produce similar representations for augmented views of the same image. In diffusion training, images are progressively noised and the model learns to gradually de-noise them. Multimodal training objectives also exist; some separate images and text during training, while others process them jointly. In general, the training objectives for foundation models promote the learning of broadly useful representations of data.
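For example, a next-token prediction loss can be computed by shifting a token sequence one position and applying cross-entropy between the model's predictions and the true next tokens; the tiny PyTorch model below is a stand-in for a real Transformer, and all sizes are illustrative.

    import torch
    import torch.nn as nn

    vocab_size, d_model = 1000, 64
    model = nn.Sequential(
        nn.Embedding(vocab_size, d_model),  # token embeddings
        nn.Linear(d_model, vocab_size),     # stand-in for a Transformer stack
    )

    tokens = torch.randint(0, vocab_size, (8, 128))  # a batch of token ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # each position predicts the next token

    logits = model(inputs)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1)
    )
    loss.backward()  # gradients for a single optimization step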
across new applications, ensuring adherence to data licenses, and maintaining data quality all become more difficult as data size grows. The specific demands of foundation models have only exacerbated such issues, as it remains the norm for large foundation models to use public web-scraped data. Foundation model training data also includes search engine data and SEO meta tag data. Public web data remains a plentiful resource, but it demands stringent moderation and data processing from foundation model developers before it can be successfully integrated into the training pipeline.
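A minimal sketch of that kind of processing is shown below: exact duplicates are removed by hashing, and documents are screened with a pluggable quality filter; the is_acceptable predicate is a hypothetical stand-in for real moderation and toxicity tooling.

    import hashlib
    from typing import Callable, Iterable, List

    def clean_corpus(documents: Iterable[str],
                     is_acceptable: Callable[[str], bool]) -> List[str]:
        seen, kept = set(), []
        for doc in documents:
            digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()  # exact-duplicate check
            if digest in seen or not is_acceptable(doc):
                continue  # skip duplicates and documents that fail moderation
            seen.add(digest)
            kept.append(doc)
        return kept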
challenge for many foundation model developers, and one that has led to an increasing dilemma in the field. Larger models require greater compute power, which often comes at the cost of compute efficiency. Since training remains time-consuming and expensive, this tradeoff means that only a select few companies can afford the production costs of large, state-of-the-art foundation models. Techniques such as compression and distillation can make inference more affordable, but they fail to fully offset this weakness.
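As an illustration of the distillation idea, the loss below pushes a smaller student model toward a larger teacher's softened output distribution; it is a generic PyTorch sketch, and the temperature value is an arbitrary illustrative choice.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          temperature: float = 2.0) -> torch.Tensor:
        # Soften both distributions, then match the student to the teacher.
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * (temperature ** 2)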
'language model' was too narrow given that the focus is not only language; 'self-supervised model' was too specific to the training objective; and 'pretrained model' suggested that the noteworthy action all happened after 'pretraining'." The term "foundation model" was chosen over "foundational model" because "foundational" implies that these models provide fundamental principles in a way that "foundation" does not. After considering many terms, they settled on "foundation model" to emphasize the intended function (i.e., amenability to subsequent further development) rather than modality, architecture, or implementation.

Anna Eshoo (D, CA) defines a foundation model as "an artificial intelligence model trained on broad data, generally uses self supervision, generally contains at least 1,000,000,000 parameters, is applicable across a wide range of contexts, and exhibits, or could be easily modified to exhibit, high levels of performance at tasks that could pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters."
and also the most exclusive resources. To train larger and more complex AI, a sufficient amount of compute is key. However, compute is consolidated in the hands of a few select entities, on which most foundation model developers depend. As such, the foundation model pipeline is concentrated heavily around these providers. Compute is also costly; in 2023, AI companies spent more than 80% of total capital on compute resources.
amounts of data and compute (also referred to as computational power). Due to foundation models' large development costs and inexpensive adaptation requirements, the AI landscape has shifted to a small subset of AI companies making foundation models for downstream adaptation. Most foundation model companies therefore outsource this step to specialized data providers (e.g. Scale AI, Surge) and compute providers (e.g. Amazon Web Services, Google Cloud, Microsoft Azure).
Eshoo's definition also specifies that foundation models must achieve a level of performance high enough to be a potential danger. In contrast, the E.U. definition mentions whether the model is designed for generality of output. Nonetheless, all the definitions share the requirement that foundation models be trained on a broad range of data with potential applications in many domains.
Foundation models are noteworthy given the unprecedented resource investment, model and data size, and ultimately their scope of application when compared to previous forms of AI. The rise of foundation models constitutes a new paradigm in AI, in which general-purpose models function as reusable infrastructure instead of bespoke, one-off task-specific models.

When a model is released via an API, users can query the model and receive responses, but cannot directly access the model itself. Comparatively, the model could be made directly downloadable for users to access and modify. Both release strategies are often classified as an open release. The exact definition of an open release is disputed, but widely accepted requirements are provided by the Open Source Initiative.

Training objectives should also be domain complete, or able to solve a broad set of downstream capabilities within the given domain. Lastly, foundation model training objectives should scale well and be computationally efficient. With model size and compute power both being relevant constraints, a training objective must be able to overcome such bottlenecks.
frontier models continue to adapt after deployment, it remains difficult to mitigate all harms that arise from already-deployed models. If a frontier model happens to be open-source or is released online, the model can also disseminate rapidly, further hampering regulators by creating a lack of accountability.
Foundation models require a large amount of general data to power their capabilities. Early foundation models scraped subsets of the internet to provide this data. As the size and scope of foundation models grow, larger quantities of internet scraping become necessary, resulting in
The foundation model developer itself will then take the data and use the supplied compute to actually train the foundation model. After the foundation model is completely built, much of the data and labor requirements abate. In this development process, hardware and compute are the most necessary,
Foundation models are inherently multi-purpose: using such a model for a specific use case requires some form of adaptation. At a minimum, models need to be adapted to perform the task of interest (task specification), but better performance can often be achieved by more extensive adaptation to the
Government agencies such as the European Parliament have identified the regulation of general-purpose AI, including foundation models, as a high priority. General-purpose AI systems are often characterized by large size, opacity, and potential for emergence, all of which can create unintended harms. Such systems also
Due to frontier models' unique capabilities, it is difficult to regulate their development and deployment effectively. Because their capabilities can emerge unpredictably, new dangerous capabilities can appear in frontier models on their own, both during development and after deployment. Additionally, since
Large Scale Visual Recognition Challenge. AlexNet exhibited strong performance on a large-scale general dataset and was an early, high-profile demonstration of what deep learning could achieve. Alongside the methodological shift to end-to-end optimization of deep neural networks, the 2010s were also marked by a software shift. In
Foundation models draw upon a series of advances in the history of AI. These models can be situated against the backdrop of the broader rise of machine learning since the 1990s. Prior AI models depended on specific instructions to solve a given task, but machine learning-powered models were able to
that can support a diverse range of use cases. Building foundation models is often highly resource-intensive, with the most expensive models costing hundreds of millions of dollars to pay for the underlying data and compute required. In contrast, adapting an existing foundation model for a specific
The foundation model will then be hosted online, either by the developer or by an external organization. Once released, other parties can create applications based on the foundation model, whether through fine-tuning or for wholly new purposes. People can then access these applications to serve their
GPUs are the most common choice of compute hardware for machine learning because of their large memory and high computational throughput. Typical foundation model training requires many GPUs connected in parallel with fast interconnects. Acquiring a sufficient number of GPUs of the requisite compute efficiency is a
Training foundation models often runs the risk of violating user privacy, as private data can be disclosed, collected, or used in ways beyond the stated scope. Even if no private data is leaked, models can still inadvertently compromise security through learned behavior in the resulting foundation
Foundation models are built by optimizing one or more training objectives, mathematical functions that determine how model parameters are updated based on the model's predictions on training data. Language models are often trained with a next-token prediction objective, which refers to the extent to
To address this issue of low-quality data that arose with unsupervised training, some foundation model developers have turned to manual filtering. This practice, known as data labor, comes with its own host of issues. Such manual data detoxification is often outsourced to reduce labor costs, with
Certain highly advanced foundation models are termed "frontier models," which have the potential to "possess dangerous capabilities sufficient to pose severe risks to public safety." These "dangerous capabilities" stem from the accidental or intentional misuse of such models, which in conjunction
Overall, while many of these definitions stay close to the original Stanford definition, they do introduce some subtle distinctions. For example, the U.S. definitions are the only ones to reference the size of a foundation model, though they differ on the exact magnitude. Beyer and
After a foundation model is built, it can be released in one of many ways. There are many facets to a release: the asset itself, who has access, how access changes over time, and the conditions on use. All these factors contribute to how a foundation model will affect downstream applications. In
Foundation models' general capabilities allow them to fulfill a unique role in the AI ecosystem, fueled by many upstream and downstream technologies. Training a foundation model requires several resources (e.g. data, compute, labor, hardware, code), with foundation models often involving immense
Since foundation models' utility depends on their own general capabilities and the performance of fine-tuned applications, evaluation must cover both metrics. Proper evaluation examines both a foundation model's downstream applications in aggregate and the direct properties the foundation model
The size of foundation models also brings about issues with the computer systems they run on. The average foundation model is too large to run within a single accelerator's memory, and the initial training process requires an enormous amount of resources. Such issues are predicted to further
Foundation models are trained on a large quantity of data, working under the maxim "the more data, the better." Performance evaluation does show that more data generally leads to better performance, but other issues arise as data quantity grows. Tasks like managing the dataset, integrating data
Evaluation is a key part of developing foundation models. Not only does evaluation allow for tracking progress of high-performance models, it also creates benchmarks for future model development. Stakeholders rely on evaluations to understand model behaviors and gain insight into their various
Due to their adaptability to a wide range of use-cases, foundation models are sometimes considered to be examples of general-purpose AI. In designing the EU AI Act, the European Parliament has stated that a new wave of general-purpose AI technologies shapes the overall AI ecosystem. The fuller
With the rise of foundation models and the larger datasets that power them, a training objective must be able to parse through internet-scale data for meaningful data points. Additionally, since foundation models are designed to solve a general range of tasks, training objectives ought to be
The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) coined the term "foundation model" in August 2021 to mean "any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g.,
The accuracy and capabilities of foundation models often scale predictably with the size of the model and the amount of the training data. Specifically, scaling laws have been discovered, which are data-based empirical trends that relate resources (data, model size, compute usage) to model
Relative to most prior work on deep learning, these language models demonstrated the potential of training on much larger web-sourced datasets using self-supervised objectives (e.g. predicting the next word in a large corpus of text). These approaches, which draw upon earlier works like

While open foundation models can further research and development more easily, they are also more susceptible to misuse. Open foundation models can be downloaded by anyone, and particularly powerful models can be fine-tuned to intentionally or unintentionally cause harm.
model. Data quality is another key point, as web-scraped data frequently contains biased, duplicate, and toxic material. Once foundation models are deployed, ensuring high-quality data is still an issue, as undesirable behavior can still emerge from small subsets of data.
The next major step was the advent of deep learning circa 2010. With larger datasets and more advanced neural networks, AI models were able to achieve higher levels of performance. One of the first major instances of deep learning was the model architecture
MMLU, MMMU, HumanEval, and GSM8K. Given that foundation models are multi-purpose, meta-benchmarks that aggregate different underlying benchmarks are increasingly being developed. Examples include LM-Harness, BIG-Bench, HELM, OpenLLM Leaderboard, DecodingTrust, and HEIM.
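A minimal sketch of how a multiple-choice benchmark of this kind is typically scored appears below; score_option is a hypothetical callable returning the model's log-likelihood for a candidate answer, not an API of any particular benchmark suite.

    from typing import Callable, List, Tuple

    def benchmark_accuracy(items: List[Tuple[str, List[str], int]],
                           score_option: Callable[[str, str], float]) -> float:
        # Each item is (question, candidate answers, index of the correct answer).
        correct = 0
        for question, options, answer_index in items:
            scores = [score_option(question, option) for option in options]
            if scores.index(max(scores)) == answer_index:
                correct += 1
        return correct / len(items)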
For a foundation model to effectively generalize, it must acquire rich representations of the training data. As a result, expressive model architectures that efficiently process large-scale data are often preferred in building foundation models. Currently, the
structure of the ecosystem, in addition to the properties of specific general-purpose AI systems, influences the design of AI policy and research. General-purpose AI systems also often appear in people's everyday lives through applications and tools like
Since the concept of dangerous capabilities is inherently subjective, there is no strict designation for which foundation models qualify as frontier models. However, some generally held ideas for sufficiently dangerous capabilities include:
During a closed release, the foundation model cannot be accessed by the public, but is used internally by an organization. Such releases are considered safer, but offer no additional value to the research community or the public at large.
from a power law with one exponent to a power law with another (different) exponent. When no data points are collected near (or after) the break(s), it can be difficult to obtain an accurate extrapolation.
with their powerful nature can lead to severe harms. As foundation models continue to improve, some AI researchers speculate that almost all next-generation foundation models will be considered frontier models.
holds. To ensure further equity in evaluation, certain existing evaluation frameworks account for all adaptation resources, which leads to more informed analyses for the benefit of all stakeholders.
for music, and RT-2 for robotic control. Foundation models constitute a broad shift in AI development: foundation models are being built for astronomy, radiology, genomics, music, coding,
capabilities. Particularly, a model's scale is defined by compute, dataset size, and the number of parameters, all of which exhibit a power-law relationship with end performance.
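One commonly reported form of such a law (following Kaplan et al.) expresses test loss as a power law in parameter count N or dataset size D; the exponents below are approximate empirical values for language models and should be read as illustrative rather than universal.

    L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
    L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D},
    \qquad \alpha_N \approx 0.076,\; \alpha_D \approx 0.095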
defines a foundation model as an "AI model that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks".
exacerbate in the future as foundation models grow to new heights. Due to this constraint, researchers have begun looking into compressing model size to make inference more tractable.
heavily influence downstream applications, which further exacerbates the need for regulation. In regards to prominent legislation, a number of stakeholders have pushed for the
fine-tuned) to a wide range of downstream tasks". This was based on their observation that preexisting terms, while overlapping, were not adequate, stating that "'
and the increased use of training data with minimal supervision all contributed to the rise of foundation models. Some noteworthy foundation models include:

The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) created and popularized the term.
in 2023 contributed to a greater emphasis on how foundation models are released, with open foundation models garnering both support and scrutiny.
defines foundation models as "a type of AI technology that are trained on vast amounts of data that can be adapted to a wide range of tasks and operations."
higher likelihoods of biased or toxic data. This toxic or biased data can disproportionately harm marginalized groups and exacerbate existing prejudices.
Stable Diffusion and ChatGPT (initially powered by the GPT-3.5 model) led to foundation models and generative AI entering widespread public discourse. Further, releases of
model that is trained on broad data such that it can be applied across a wide range of use cases. Foundation models have transformed
4096: 392: 4136: 3139: 2986: 2060: 4368: 3022: 1600: 1563: 1080: 711: 495:
attributes. Traditionally, foundation models are evaluated relative to each other through standardized task benchmarks like
4358: 3812: 2727: 2418: 2254: 2129: 642: 254: 3913: 3464: 3201: 2981: 847: 4363: 4278: 2588: 40: 4015: 3725: 3352: 3159: 3015: 2742: 2573: 226:
Foundation models began to materialize as the latest wave of deep learning models in the late 2010s with models like
3980: 2930: 2583: 297:. Each of these models came with its own unique abilities, particularly in their strong generative capabilities. 4378: 4086: 3867: 3807: 3405: 2578: 2323: 557:
particular, the two most common forms of foundation model release are through APIs and direct model downloads.
955:
Bommasani, Rishi; et al. (18 August 2021). On the Opportunities and Risks of Foundation Models (Report).
3400: 3089: 2847: 2568: 1677: 1393: 755: 527: 4317: 4177: 4106: 3842: 3239: 3196: 3149: 3144: 2540: 377:
to include restrictions on general-purpose AI systems, all of which would also apply to foundation models.
4323: 3893: 3189: 3115: 2885: 2870: 2842: 2707: 2702: 2277: 1211:; O'Keefe, Cullen; Whittlestone, Jess; Avin, Shahar; Brundage, Miles; Bullock, Justin (7 November 2023), 614:'s Llama 2 are open, with broadly available model weights enabling downstream modification and scrutiny. 479: 374: 151: 135:
In the United States, the proposed AI Foundation Model Transparency Act of 2023 by House Representatives
4373: 3517: 3452: 3053: 2622: 2593: 2371: 941:
https://www.orbitalmaterials.com/post/technical-blog-introducing-the-orb-ai-based-interatomic-potential
607: 51: 598:'s Flamingo are fully closed, meaning they are available only to the model developer; others, such as 300:
In particular, 2022 was particularly influential in the history of foundation models. The releases of
246:, deviated from prior supervised approaches that required annotated data (e.g. crowd-sources labels). 4076: 4060: 4010: 3918: 3776: 3415: 3246: 3069: 2465: 2318: 1056:"Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence" 455: 451: 249:
Overall, the computational advances in specialized hardware and parallelism (e.g., large clusters of
1418:
Bommasani, Rishi; Soylu, Dilara; Liao, Thomas I.; Creel, Kathleen A.; Liang, Percy (28 March 2023),
1311:
Nori, Harsha; King, Nicholas; McKinney, Scott Mayer; Carignan, Dean; Horvitz, Eric (12 April 2023),
4111: 4030: 3817: 3074: 2991: 2915: 2647: 2603: 2488: 2386: 187: 132:; contains at least tens of billions of parameters; is applicable across a wide range of contexts". 129: 125:
Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence
1579:
Bender, Emily M.; Gebru, Timnit; McMillan-Major, Angelina; Shmitchell, Shmargaret (1 March 2021).
3966: 3862: 3847: 3500: 3495: 3395: 3263: 3044: 2895: 2865: 2532: 395:
architecture is the de facto choice for building foundation models across a range of modalities.
36: 3931: 2366: 1924:
Linzen, Tal (July 2020). Jurafsky, Dan; Chai, Joyce; Schluter, Natalie; Tetreault, Joel (eds.).
1751: 1704:
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
1179:"Hawley and Blumenthal Demand Answers from Meta, Warn of Misuse After 'Leak' of Meta's AI Model" 178:
Technologically, foundation models are built using established machine learning techniques like
4383: 4035: 3822: 3582: 3301: 3296: 2752: 2445: 2423: 2413: 2381: 2356: 2078: 1726: 565: 454:
have been discovered in which this relationship smoothly transitions (at points referred to as
976: 3990: 3852: 3837: 3802: 3490: 3390: 3258: 2612: 1208: 262: 124: 79: 3720: 2181: 1801:
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
82:. Beyond text, foundation models have been developed across a range of modalities—including 4081: 3872: 3827: 3273: 3218: 3064: 3059: 2965: 2641: 2617: 2470: 1655:
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
1259: 1119: 826: 341:
Producing and propagating convincing, tailored disinformation with minimal user instruction
179: 104: 8: 4296: 3447: 3425: 3174: 3169: 3127: 3079: 2945: 2875: 2832: 2788: 2560: 2550: 2545: 2433: 2154: 1029: 513: 475: 270: 147: 128:
defines a foundation model as "an AI model that is trained on broad data; generally uses
1288: 1263: 1237: 1123: 1003: 643:
https://assets.publishing.service.gov.uk/media/65081d3aa41cc300145612c0/Full_report_.pdf
3832: 3410: 2955: 2827: 2692: 2455: 2438: 2296: 2231: 2207: 1991: 1933: 1930:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
1804: 1707: 1658: 1621: 1587:. FAccT '21. New York, NY, USA: Association for Computing Machinery. pp. 610–623. 1541: 1517: 1493: 1447: 1423: 1375: 1316: 1249: 1216: 1109: 956: 921: 873: 804: 783: 735: 691: 471: 1776: 756:"Speaking robot: Our new AI model translates vision and language into robotic actions" 4262: 4237: 4055: 3898: 3886: 3690: 3342: 3213: 3206: 2960: 2672: 2480: 2391: 1596: 1559: 1379: 1337:"Access to A.I. Justice: Avoiding an Inequitable Two-Tiered System of Legal Services" 1293: 1275: 1135: 531:
Investment in computing capabilities to train larger AI models has rapidly increased.
223:
provided crucial infrastructure for simplifying and scaling deep learning pipelines.
183: 1947: 1585:
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency
4284: 4257: 4157: 4005: 3643: 3633: 3440: 3234: 3184: 3179: 3122: 3110: 2837: 2722: 2697: 2498: 2401: 1943: 1588: 1551: 1367: 1283: 1267: 1127: 301: 282: 199: 28: 1355: 4222: 4202: 4192: 4182: 4116: 4050: 3756: 3700: 3522: 3164: 3084: 2949: 2910: 2905: 2773: 2503: 2376: 2351: 2333: 2182:"The Time is Now to Develop Community Norms for the Release of Foundation Models" 1580: 595: 548:
various means, allowing one foundation model to power and reach a wide audience.
521: 59: 1538:
Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
4045: 3730: 3695: 3685: 3510: 3268: 3094: 2657: 2637: 2361: 2199: 1271: 611: 194:
decipher what task to solve given sufficient data. Such a shift from so-called
118:
As governments regulate foundation models, new legal definitions have emerged.
2246: 1961: 1926:"How Can We Accelerate Progress Towards Human-like Linguistic Generalization?" 1848: 1823: 831: 4352: 4252: 4197: 4167: 3675: 3655: 3572: 3251: 2920: 2732: 2712: 2493: 1925: 1640: 1279: 195: 32: 1592: 1555: 1394:"General-purpose artificial intelligence | Think Tank | European Parliament" 895:"Revolutionizing Time Series Forecasting: Interview with TimeGPT's creators" 4242: 4172: 3985: 3761: 3592: 3007: 2900: 2518: 1297: 1139: 517: 1581:"On the Dangers of Stochastic Parrots: Can Language Models be Too Big? 🦜" 4247: 4232: 4040: 4000: 3857: 3628: 3537: 3532: 3154: 3132: 2857: 2737: 2450: 2343: 2291: 1899: 1371: 580: 91: 4212: 4187: 4162: 4121: 4101: 3751: 3710: 3705: 3618: 3527: 3435: 3347: 3327: 2460: 1678:"Papers with Code - MMLU Benchmark (Multi-task Language Understanding)" 1131: 894: 584: 313: 294: 220: 140: 1653:
Zaken, Elad Ben; Ravfogel, Shauli; Goldberg, Yoav (5 September 2022),
1639:
Caballero, Ethan; Gupta, Kshitij; Rish, Irina; Krueger, David (2022).
4227: 4217: 3995: 3746: 3715: 3613: 3457: 3420: 3357: 3311: 3306: 3291: 2328: 1490:
Learning Transferable Visual Models From Natural Language Supervision
136: 1643:. International Conference on Learning Representations (ICLR), 2023. 1336: 4207: 3648: 3480: 2803: 2783: 2768: 2747: 2717: 2662: 2627: 2508: 2236: 2212: 1996: 1938: 1874: 1809: 1712: 1663: 1626: 1578: 1546: 1522: 1498: 1452: 1428: 1321: 1254: 1221: 1152: 1114: 961: 926: 878: 809: 788: 740: 696: 239: 211: 253:
GPUs), new developments in neural network architecture (e.g., the
111:(i.e., amenability to subsequent further development) rather than 3961: 3771: 3608: 3562: 3485: 3385: 3380: 3332: 2940: 2798: 2778: 2652: 2396: 2311: 2204:
The Gradient of Generative AI Release: Methods and Considerations
2105:"These fake images reveal how AI amplifies our worst stereotypes" 680:
Human-Centered AI, Stanford University, Stanford, CA, April 2023.
655:"Introducing the Center for Research on Foundation Models (CRFM)" 362: 305: 216: 207: 44: 1932:. Online: Association for Computational Linguistics: 5210–5217. 1213:
Frontier AI Regulation: Managing Emerging Risks to Public Safety
4141: 3786: 3766: 3638: 3430: 2306: 2301: 2180:
Liang, Percy; Bommasani, Rishi; Creel, Kathleen (17 May 2022).
918: 599: 366: 278: 250: 83: 75: 63: 1206: 3587: 3567: 3557: 3552: 3547: 3542: 3505: 3337: 2996: 2632: 1511: 603: 576: 338:
Designing and synthesizing new biological or chemical weapons
309: 290: 286: 274: 266: 258: 243: 235: 231: 2155:"Exclusive: The $ 2 Per Hour Workers Who Made ChatGPT Safer" 1441: 606:, are limited access, available to the public but only as a 3577: 2035: 1752:"Papers with Code - GSM8K Benchmark (Arithmetic Reasoning)" 1420:
Ecosystem Graphs: The Social Footprint of Foundation Models
713:
Tackling multiple tasks with a single visual language model
572: 496: 483: 227: 2011:"Accelerate the Development of AI Applications | Scale AI" 1727:"Papers with Code - HumanEval Benchmark (Code Generation)" 954: 780: 2793: 2036:"Surge AI | World's Most Powerful Data Labeling Platform" 1798: 802: 561: 215:
the mid 2010s, the rise of deep learning frameworks like
1310: 1099: 202:
was the first step towards the modern foundation model.
2228:
Flamingo: a Visual Language Model for Few-Shot Learning
1487: 1417: 689: 2225: 1988:
Market Concentration Implications of Foundation Models
1207:
Anderljung, Markus; Barnhart, Joslyn; Korinek, Anton;
55:
use case or using it directly is much less expensive.
4021:
Existential risk from artificial general intelligence
2010: 1652: 1615: 1234: 950: 948: 823: 344:
Harnessing unprecedented offensive cyber capabilities
871: 4092:
Center for Human-Compatible Artificial Intelligence
2179: 1313:
Capabilities of GPT-4 on Medical Challenge Problems
1002:Bommasani, Rishi; Liang, Percy (18 October 2021). 945: 733: 1238:"Large language models encode clinical knowledge" 4350: 4132:Leverhulme Centre for the Future of Intelligence 2479: 848:"LLark: A Multimodal Foundation Model for Music" 704: 2276: 1824:"Holistic Evaluation of Language Models (HELM)" 1701: 4127:Institute for Ethics and Emerging Technologies 2130:"How the AI industry profits from catastrophe" 1986:Vipra, Jai; Korinek, Anton (2 November 2023), 1354:Arbel, Yonathan A.; Becher, Shmuel I. (2020). 1001: 4311:Superintelligence: Paths, Dangers, Strategies 4291:Open letter on artificial intelligence (2015) 3947: 3023: 2262: 1849:"open-llm-leaderboard (Open LLM Leaderboard)" 1465: 347:Evading human control through deceptive means 3037: 2103:Tiku, Nitasha; Schaul, Kevin; Chen, Szu Yu. 1985: 1900:"Holistic Evaluation of Image Models (HEIM)" 544:some workers making less than $ 2 per hour. 467:domain of interest (domain specialization). 1353: 1153:"Joint Statement on AI Safety and Openness" 845: 3954: 3940: 3030: 3016: 2269: 2255: 2102: 1102:Annals of the New York Academy of Sciences 637:Competition and Markets Authority (2023). 2235: 2211: 1995: 1937: 1808: 1711: 1662: 1625: 1545: 1521: 1497: 1451: 1427: 1320: 1287: 1253: 1220: 1113: 960: 925: 893:Se, Ksenia; Spektor, Ian (5 April 2024). 892: 877: 830: 808: 787: 739: 695: 94:forecasting, mathematics, and chemistry. 4097:Centre for the Study of Existential Risk 2198: 1962:"Ecosystem Graphs for Foundation Models" 1468:"A Mathematical Theory of Communication" 846:Engineering, Spotify (13 October 2023). 526: 58:Early examples of foundation models are 4137:Machine Intelligence Research Institute 1535: 1514:Scaling Laws for Neural Language Models 1444:The Foundation Model Transparency Index 1356:"Contracts in the Age of Smart Readers" 1334: 4351: 1923: 1081:"AI Foundation Model Transparency Act" 1027: 16:Artificial intelligence model paradigm 3935: 3011: 2250: 1618:Language Models are Few-Shot Learners 1202: 1200: 1095: 1093: 1053: 551: 355: 3868:Generative adversarial network (GAN) 2728:Simple Knowledge Organization System 1466:Claude Elwood, Shannon (July 1948). 1054:House, The White (30 October 2023). 639:AI Foundation Models: Initial Report 633: 631: 629: 627: 380: 163:AI Foundation Models: Initial Report 2076: 647: 319: 115:, architecture, or implementation. 13: 4279:Statement on AI risk of extinction 1197: 1090: 1028:Marcus, Gary (11 September 2021). 1004:"Reflections on Foundation Models" 977:"Reflections on Foundation Models" 324: 14: 4395: 4016:Ethics of artificial intelligence 2743:Thesaurus (information retrieval) 624: 571:Some open foundation models are: 159:Competition and Markets Authority 4335: 4334: 4026:Friendly artificial intelligence 3906: 3905: 3885: 2066:. 15 April 2024. pp. 37–39. 1778:EleutherAI/lm-evaluation-harness 1030:"Has AI found a new Foundation?" 560:When a model is released via an 2219: 2192: 2173: 2147: 2122: 2096: 2070: 2053: 2028: 2003: 1979: 1954: 1917: 1892: 1867: 1841: 1816: 1792: 1769: 1744: 1719: 1695: 1670: 1646: 1633: 1609: 1572: 1529: 1505: 1481: 1459: 1435: 1411: 1386: 1347: 1335:Simshaw, Drew (22 April 2022). 
1328: 1304: 1228: 1171: 1145: 1073: 1047: 1021: 995: 969: 934: 912: 886: 865: 839: 817: 506: 4087:Center for Applied Rationality 3818:Recurrent neural network (RNN) 3808:Differentiable neural computer 2324:Natural language understanding 796: 774: 748: 727: 683: 673: 150:'s negotiated position on the 97: 1: 4369:Computational fields of study 3863:Variational autoencoder (VAE) 3823:Long short-term memory (LSTM) 3090:Computational learning theory 2848:Optical character recognition 1948:10.18653/v1/2020.acl-main.465 1475:Bell System Technical Journal 617: 489: 461: 4107:Future of Humanity Institute 3843:Convolutional neural network 2541:Multi-document summarization 2079:"Computational Power and AI" 1641:"Broken Neural Scaling Laws" 610:; and still others, such as 594:Some foundation models like 52:general-purpose technologies 7: 4359:Natural language processing 4324:Artificial Intelligence Act 4318:Do You Trust This Computer? 3838:Multilayer perceptron (MLP) 2871:Latent Dirichlet allocation 2843:Natural language generation 2708:Machine-readable dictionary 2703:Linguistic Linked Open Data 2278:Natural language processing 2061:"2024 AI Index - chapter 1" 1781:, EleutherAI, 21 April 2024 470:A variety of methods (e.g. 398: 385: 157:In the United Kingdom, the 146:In the European Union, the 10: 4400: 3914:Artificial neural networks 3828:Gated recurrent unit (GRU) 3054:Differentiable programming 2623:Explicit semantic analysis 2372:Deep linguistic processing 1272:10.1038/s41586-023-06291-2 441: 428: 173: 122:In the United States, the 4364:Computational linguistics 4332: 4271: 4150: 4077:Alignment Research Center 4069: 4061:Technological singularity 4011:Effective accelerationism 3973: 3881: 3795: 3739: 3668: 3601: 3473: 3373: 3366: 3320: 3284: 3247:Artificial neural network 3227: 3103: 3070:Automatic differentiation 3043: 2974: 2929: 2884: 2856: 2816: 2761: 2683: 2671: 2602: 2559: 2531: 2466:Word-sense disambiguation 2342: 2319:Computational linguistics 2284: 2077:pnp (27 September 2023). 
1875:"DecodingTrust Benchmark" 832:10.1101/2022.10.10.511571 86:and Flamingo for images, 39:(AI), powering prominent 4112:Future of Life Institute 4031:Instrumental convergence 3075:Neuromorphic engineering 3038:Differentiable computing 2992:Natural Language Toolkit 2916:Pronunciation assessment 2818:Automatic identification 2648:Latent semantic analysis 2604:Distributional semantics 2489:Compound-term processing 2387:Named-entity recognition 188:self-supervised learning 3967:artificial intelligence 3848:Residual neural network 3264:Artificial Intelligence 2896:Automated essay scoring 2866:Document classification 2533:Automatic summarization 1879:decodingtrust.github.io 1593:10.1145/3442188.3445922 1556:10.1145/3351095.3372829 1341:SSRN Electronic Journal 415: 37:artificial intelligence 4036:Intelligence explosion 2753:Universal Dependencies 2446:Terminology extraction 2429:Semantic decomposition 2424:Semantic role labeling 2414:Part-of-speech tagging 2382:Information extraction 2367:Coreference resolution 2357:Collocation extraction 1398:www.europarl.europa.eu 566:Open Source Initiative 532: 105:(large) language model 50:Foundation models are 4379:Unsupervised learning 3991:AI capability control 3803:Neural Turing machine 3391:Human image synthesis 2514:Sentence segmentation 2134:MIT Technology Review 530: 210:, which won the 2012 4082:Center for AI Safety 3894:Computer programming 3873:Graph neural network 3448:Text-to-video models 3426:Text-to-image models 3274:Large language model 3259:Scientific computing 3065:Statistical manifold 3060:Information geometry 2966:Voice user interface 2677:datasets and corpora 2618:Document-term matrix 2471:Word-sense induction 1540:. pp. 306–316. 1372:10.2139/ssrn.3740356 180:deep neural networks 4297:Our Final Invention 3240:In-context learning 3080:Pattern recognition 2946:Interactive fiction 2876:Pachinko allocation 2833:Speech segmentation 2789:Google Ngram Viewer 2561:Machine translation 2551:Text simplification 2546:Sentence extraction 2434:Semantic similarity 2202:(5 February 2023), 1264:2023Natur.620..172S 1183:Senator Josh Hawley 1124:2023NYASA1525..140B 514:Amazon Web Services 476:in-context learning 452:broken scaling laws 148:European Parliament 3833:Echo state network 3721:Jürgen Schmidhuber 3416:Facial recognition 3411:Speech recognition 3321:Software libraries 2956:Question answering 2828:Speech recognition 2693:Corpus linguistics 2673:Language resources 2456:Textual entailment 2439:Sentiment analysis 1756:paperswithcode.com 1731:paperswithcode.com 1682:paperswithcode.com 1132:10.1111/nyas.15007 552:Release strategies 533: 356:General-purpose AI 43:applications like 4374:Language modeling 4346: 4345: 4263:Eliezer Yudkowsky 4238:Stuart J. Russell 4056:Superintelligence 3929: 3928: 3691:Stephen Grossberg 3664: 3663: 3005: 3004: 2961:Virtual assistant 2886:Computer-assisted 2812: 2811: 2569:Computer-assisted 2527: 2526: 2519:Word segmentation 2481:Text segmentation 2419:Semantic analysis 2407:Syntactic parsing 2392:Ontology learning 2161:. 18 January 2023 1966:crfm.stanford.edu 1904:crfm.stanford.edu 1855:. 9 November 2023 1828:crfm.stanford.edu 1602:978-1-4503-8309-7 1565:978-1-4503-6936-7 1360:Geo. Wash. L. Rev 1248:(7972): 172–180. 1159:. 31 October 2023 983:. 
18 October 2021 381:Technical details 184:transfer learning 4391: 4338: 4337: 4285:Human Compatible 4258:Roman Yampolskiy 4006:Consequentialism 3963:Existential risk 3956: 3949: 3942: 3933: 3932: 3919:Machine learning 3909: 3908: 3889: 3644:Action selection 3634:Self-driving car 3441:Stable Diffusion 3406:Speech synthesis 3371: 3370: 3235:Machine learning 3111:Gradient descent 3032: 3025: 3018: 3009: 3008: 2982:Formal semantics 2931:Natural language 2838:Speech synthesis 2820:and data capture 2723:Semantic network 2698:Lexical resource 2681: 2680: 2499:Lexical analysis 2477: 2476: 2402:Semantic parsing 2271: 2264: 2257: 2248: 2247: 2241: 2240: 2239: 2223: 2217: 2216: 2215: 2196: 2190: 2189: 2177: 2171: 2170: 2168: 2166: 2151: 2145: 2144: 2142: 2140: 2126: 2120: 2119: 2117: 2115: 2100: 2094: 2093: 2091: 2089: 2083:AI Now Institute 2074: 2068: 2067: 2065: 2057: 2051: 2050: 2048: 2046: 2032: 2026: 2025: 2023: 2021: 2007: 2001: 2000: 1999: 1983: 1977: 1976: 1974: 1972: 1958: 1952: 1951: 1941: 1921: 1915: 1914: 1912: 1910: 1896: 1890: 1889: 1887: 1885: 1871: 1865: 1864: 1862: 1860: 1845: 1839: 1838: 1836: 1834: 1820: 1814: 1813: 1812: 1796: 1790: 1789: 1788: 1786: 1773: 1767: 1766: 1764: 1762: 1748: 1742: 1741: 1739: 1737: 1723: 1717: 1716: 1715: 1699: 1693: 1692: 1690: 1688: 1674: 1668: 1667: 1666: 1650: 1644: 1637: 1631: 1630: 1629: 1613: 1607: 1606: 1576: 1570: 1569: 1549: 1533: 1527: 1526: 1525: 1509: 1503: 1502: 1501: 1485: 1479: 1478: 1472: 1463: 1457: 1456: 1455: 1439: 1433: 1432: 1431: 1415: 1409: 1408: 1406: 1404: 1390: 1384: 1383: 1351: 1345: 1344: 1332: 1326: 1325: 1324: 1308: 1302: 1301: 1291: 1257: 1232: 1226: 1225: 1224: 1204: 1195: 1194: 1192: 1190: 1175: 1169: 1168: 1166: 1164: 1149: 1143: 1142: 1117: 1097: 1088: 1087: 1085: 1077: 1071: 1070: 1068: 1066: 1051: 1045: 1044: 1042: 1040: 1025: 1019: 1018: 1016: 1014: 999: 993: 992: 990: 988: 973: 967: 966: 964: 952: 943: 938: 932: 931: 929: 916: 910: 909: 907: 905: 890: 884: 883: 881: 869: 863: 862: 860: 858: 852:Spotify Research 843: 837: 836: 834: 821: 815: 814: 812: 800: 794: 793: 791: 778: 772: 771: 769: 767: 752: 746: 745: 743: 731: 725: 724: 723: 721: 708: 702: 701: 699: 687: 681: 677: 671: 670: 668: 666: 661:. 18 August 2021 651: 645: 641:. 
Available at: 635: 320:Related concepts 302:Stable Diffusion 283:Stable Diffusion 200:machine learning 130:self-supervision 29:machine learning 23:, also known as 21:foundation model 4399: 4398: 4394: 4393: 4392: 4390: 4389: 4388: 4349: 4348: 4347: 4342: 4328: 4267: 4223:Steve Omohundro 4203:Geoffrey Hinton 4193:Stephen Hawking 4178:Paul Christiano 4158:Scott Alexander 4146: 4117:Google DeepMind 4065: 4051:Suffering risks 3969: 3960: 3930: 3925: 3877: 3791: 3757:Google DeepMind 3735: 3701:Geoffrey Hinton 3660: 3597: 3523:Project Debater 3469: 3367:Implementations 3362: 3316: 3280: 3223: 3165:Backpropagation 3099: 3085:Tensor calculus 3039: 3036: 3006: 3001: 2970: 2950:Syntax guessing 2932: 2925: 2911:Predictive text 2906:Grammar checker 2887: 2880: 2852: 2819: 2808: 2774:Bank of English 2757: 2685: 2676: 2667: 2598: 2555: 2523: 2475: 2377:Distant reading 2352:Argument mining 2338: 2334:Text processing 2280: 2275: 2245: 2244: 2224: 2220: 2200:Solaiman, Irene 2197: 2193: 2178: 2174: 2164: 2162: 2153: 2152: 2148: 2138: 2136: 2128: 2127: 2123: 2113: 2111: 2109:Washington Post 2101: 2097: 2087: 2085: 2075: 2071: 2063: 2059: 2058: 2054: 2044: 2042: 2034: 2033: 2029: 2019: 2017: 2009: 2008: 2004: 1984: 1980: 1970: 1968: 1960: 1959: 1955: 1922: 1918: 1908: 1906: 1898: 1897: 1893: 1883: 1881: 1873: 1872: 1868: 1858: 1856: 1847: 1846: 1842: 1832: 1830: 1822: 1821: 1817: 1797: 1793: 1784: 1782: 1775: 1774: 1770: 1760: 1758: 1750: 1749: 1745: 1735: 1733: 1725: 1724: 1720: 1700: 1696: 1686: 1684: 1676: 1675: 1671: 1651: 1647: 1638: 1634: 1614: 1610: 1603: 1577: 1573: 1566: 1534: 1530: 1510: 1506: 1486: 1482: 1470: 1464: 1460: 1440: 1436: 1416: 1412: 1402: 1400: 1392: 1391: 1387: 1352: 1348: 1333: 1329: 1309: 1305: 1233: 1229: 1205: 1198: 1188: 1186: 1177: 1176: 1172: 1162: 1160: 1151: 1150: 1146: 1098: 1091: 1083: 1079: 1078: 1074: 1064: 1062: 1060:The White House 1052: 1048: 1038: 1036: 1026: 1022: 1012: 1010: 1000: 996: 986: 984: 975: 974: 970: 953: 946: 939: 935: 917: 913: 903: 901: 891: 887: 870: 866: 856: 854: 844: 840: 822: 818: 801: 797: 779: 775: 765: 763: 754: 753: 749: 732: 728: 719: 717: 716:, 28 April 2022 710: 709: 705: 688: 684: 678: 674: 664: 662: 653: 652: 648: 636: 625: 620: 596:Google DeepMind 554: 522:Microsoft Azure 509: 492: 464: 444: 431: 418: 410:domain complete 401: 388: 383: 358: 327: 325:Frontier models 322: 312:, Llama 2, and 293:, LLaMA 2, and 198:to data-driven 176: 100: 60:language models 17: 12: 11: 5: 4397: 4387: 4386: 4381: 4376: 4371: 4366: 4361: 4344: 4343: 4333: 4330: 4329: 4327: 4326: 4321: 4314: 4307: 4300: 4293: 4288: 4281: 4275: 4273: 4269: 4268: 4266: 4265: 4260: 4255: 4250: 4245: 4240: 4235: 4230: 4225: 4220: 4215: 4210: 4205: 4200: 4195: 4190: 4185: 4180: 4175: 4170: 4165: 4160: 4154: 4152: 4148: 4147: 4145: 4144: 4139: 4134: 4129: 4124: 4119: 4114: 4109: 4104: 4099: 4094: 4089: 4084: 4079: 4073: 4071: 4067: 4066: 4064: 4063: 4058: 4053: 4048: 4046:Machine ethics 4043: 4038: 4033: 4028: 4023: 4018: 4013: 4008: 4003: 3998: 3993: 3988: 3983: 3977: 3975: 3971: 3970: 3959: 3958: 3951: 3944: 3936: 3927: 3926: 3924: 3923: 3922: 3921: 3916: 3903: 3902: 3901: 3896: 3882: 3879: 3878: 3876: 3875: 3870: 3865: 3860: 3855: 3850: 3845: 3840: 3835: 3830: 3825: 3820: 3815: 3810: 3805: 3799: 3797: 3793: 3792: 3790: 3789: 3784: 3779: 3774: 3769: 3764: 3759: 3754: 3749: 3743: 3741: 3737: 3736: 3734: 3733: 3731:Ilya Sutskever 3728: 3723: 3718: 3713: 3708: 3703: 3698: 3696:Demis Hassabis 3693: 3688: 3686:Ian Goodfellow 3683: 3678: 3672: 3670: 3666: 3665: 3662: 3661: 3659: 3658: 
3653: 3652: 3651: 3641: 3636: 3631: 3626: 3621: 3616: 3611: 3605: 3603: 3599: 3598: 3596: 3595: 3590: 3585: 3580: 3575: 3570: 3565: 3560: 3555: 3550: 3545: 3540: 3535: 3530: 3525: 3520: 3515: 3514: 3513: 3503: 3498: 3493: 3488: 3483: 3477: 3475: 3471: 3470: 3468: 3467: 3462: 3461: 3460: 3455: 3445: 3444: 3443: 3438: 3433: 3423: 3418: 3413: 3408: 3403: 3398: 3393: 3388: 3383: 3377: 3375: 3368: 3364: 3363: 3361: 3360: 3355: 3350: 3345: 3340: 3335: 3330: 3324: 3322: 3318: 3317: 3315: 3314: 3309: 3304: 3299: 3294: 3288: 3286: 3282: 3281: 3279: 3278: 3277: 3276: 3269:Language model 3266: 3261: 3256: 3255: 3254: 3244: 3243: 3242: 3231: 3229: 3225: 3224: 3222: 3221: 3219:Autoregression 3216: 3211: 3210: 3209: 3199: 3197:Regularization 3194: 3193: 3192: 3187: 3182: 3172: 3167: 3162: 3160:Loss functions 3157: 3152: 3147: 3142: 3137: 3136: 3135: 3125: 3120: 3119: 3118: 3107: 3105: 3101: 3100: 3098: 3097: 3095:Inductive bias 3092: 3087: 3082: 3077: 3072: 3067: 3062: 3057: 3049: 3047: 3041: 3040: 3035: 3034: 3027: 3020: 3012: 3003: 3002: 3000: 2999: 2994: 2989: 2984: 2978: 2976: 2972: 2971: 2969: 2968: 2963: 2958: 2953: 2943: 2937: 2935: 2933:user interface 2927: 2926: 2924: 2923: 2918: 2913: 2908: 2903: 2898: 2892: 2890: 2882: 2881: 2879: 2878: 2873: 2868: 2862: 2860: 2854: 2853: 2851: 2850: 2845: 2840: 2835: 2830: 2824: 2822: 2814: 2813: 2810: 2809: 2807: 2806: 2801: 2796: 2791: 2786: 2781: 2776: 2771: 2765: 2763: 2759: 2758: 2756: 2755: 2750: 2745: 2740: 2735: 2730: 2725: 2720: 2715: 2710: 2705: 2700: 2695: 2689: 2687: 2678: 2669: 2668: 2666: 2665: 2660: 2658:Word embedding 2655: 2650: 2645: 2638:Language model 2635: 2630: 2625: 2620: 2615: 2609: 2607: 2600: 2599: 2597: 2596: 2591: 2589:Transfer-based 2586: 2581: 2576: 2571: 2565: 2563: 2557: 2556: 2554: 2553: 2548: 2543: 2537: 2535: 2529: 2528: 2525: 2524: 2522: 2521: 2516: 2511: 2506: 2501: 2496: 2491: 2485: 2483: 2474: 2473: 2468: 2463: 2458: 2453: 2448: 2442: 2441: 2436: 2431: 2426: 2421: 2416: 2411: 2410: 2409: 2404: 2394: 2389: 2384: 2379: 2374: 2369: 2364: 2362:Concept mining 2359: 2354: 2348: 2346: 2340: 2339: 2337: 2336: 2331: 2326: 2321: 2316: 2315: 2314: 2309: 2299: 2294: 2288: 2286: 2282: 2281: 2274: 2273: 2266: 2259: 2251: 2243: 2242: 2218: 2191: 2172: 2146: 2121: 2095: 2069: 2052: 2040:www.surgehq.ai 2027: 2002: 1978: 1953: 1916: 1891: 1866: 1853:huggingface.co 1840: 1815: 1791: 1768: 1743: 1718: 1694: 1669: 1645: 1632: 1608: 1601: 1571: 1564: 1528: 1504: 1480: 1458: 1434: 1410: 1385: 1346: 1327: 1303: 1227: 1196: 1170: 1144: 1108:(1): 140–146, 1089: 1072: 1046: 1020: 994: 968: 944: 933: 911: 885: 864: 838: 816: 795: 773: 762:. 
28 July 2023 747: 726: 703: 682: 672: 646: 622: 621: 619: 616: 553: 550: 508: 505: 491: 488: 463: 460: 443: 440: 430: 427: 417: 414: 400: 397: 387: 384: 382: 379: 357: 354: 349: 348: 345: 342: 339: 326: 323: 321: 318: 196:expert systems 175: 172: 167: 166: 155: 144: 133: 99: 96: 25:large AI model 15: 9: 6: 4: 3: 2: 4396: 4385: 4384:Deep learning 4382: 4380: 4377: 4375: 4372: 4370: 4367: 4365: 4362: 4360: 4357: 4356: 4354: 4341: 4331: 4325: 4322: 4320: 4319: 4315: 4313: 4312: 4308: 4306: 4305: 4304:The Precipice 4301: 4299: 4298: 4294: 4292: 4289: 4287: 4286: 4282: 4280: 4277: 4276: 4274: 4270: 4264: 4261: 4259: 4256: 4254: 4253:Frank Wilczek 4251: 4249: 4246: 4244: 4241: 4239: 4236: 4234: 4231: 4229: 4226: 4224: 4221: 4219: 4216: 4214: 4211: 4209: 4206: 4204: 4201: 4199: 4198:Dan Hendrycks 4196: 4194: 4191: 4189: 4186: 4184: 4181: 4179: 4176: 4174: 4171: 4169: 4168:Yoshua Bengio 4166: 4164: 4161: 4159: 4156: 4155: 4153: 4149: 4143: 4140: 4138: 4135: 4133: 4130: 4128: 4125: 4123: 4120: 4118: 4115: 4113: 4110: 4108: 4105: 4103: 4100: 4098: 4095: 4093: 4090: 4088: 4085: 4083: 4080: 4078: 4075: 4074: 4072: 4070:Organizations 4068: 4062: 4059: 4057: 4054: 4052: 4049: 4047: 4044: 4042: 4039: 4037: 4034: 4032: 4029: 4027: 4024: 4022: 4019: 4017: 4014: 4012: 4009: 4007: 4004: 4002: 3999: 3997: 3994: 3992: 3989: 3987: 3984: 3982: 3979: 3978: 3976: 3972: 3968: 3964: 3957: 3952: 3950: 3945: 3943: 3938: 3937: 3934: 3920: 3917: 3915: 3912: 3911: 3904: 3900: 3897: 3895: 3892: 3891: 3888: 3884: 3883: 3880: 3874: 3871: 3869: 3866: 3864: 3861: 3859: 3856: 3854: 3851: 3849: 3846: 3844: 3841: 3839: 3836: 3834: 3831: 3829: 3826: 3824: 3821: 3819: 3816: 3814: 3811: 3809: 3806: 3804: 3801: 3800: 3798: 3796:Architectures 3794: 3788: 3785: 3783: 3780: 3778: 3775: 3773: 3770: 3768: 3765: 3763: 3760: 3758: 3755: 3753: 3750: 3748: 3745: 3744: 3742: 3740:Organizations 3738: 3732: 3729: 3727: 3724: 3722: 3719: 3717: 3714: 3712: 3709: 3707: 3704: 3702: 3699: 3697: 3694: 3692: 3689: 3687: 3684: 3682: 3679: 3677: 3676:Yoshua Bengio 3674: 3673: 3671: 3667: 3657: 3656:Robot control 3654: 3650: 3647: 3646: 3645: 3642: 3640: 3637: 3635: 3632: 3630: 3627: 3625: 3622: 3620: 3617: 3615: 3612: 3610: 3607: 3606: 3604: 3600: 3594: 3591: 3589: 3586: 3584: 3581: 3579: 3576: 3574: 3573:Chinchilla AI 3571: 3569: 3566: 3564: 3561: 3559: 3556: 3554: 3551: 3549: 3546: 3544: 3541: 3539: 3536: 3534: 3531: 3529: 3526: 3524: 3521: 3519: 3516: 3512: 3509: 3508: 3507: 3504: 3502: 3499: 3497: 3494: 3492: 3489: 3487: 3484: 3482: 3479: 3478: 3476: 3472: 3466: 3463: 3459: 3456: 3454: 3451: 3450: 3449: 3446: 3442: 3439: 3437: 3434: 3432: 3429: 3428: 3427: 3424: 3422: 3419: 3417: 3414: 3412: 3409: 3407: 3404: 3402: 3399: 3397: 3394: 3392: 3389: 3387: 3384: 3382: 3379: 3378: 3376: 3372: 3369: 3365: 3359: 3356: 3354: 3351: 3349: 3346: 3344: 3341: 3339: 3336: 3334: 3331: 3329: 3326: 3325: 3323: 3319: 3313: 3310: 3308: 3305: 3303: 3300: 3298: 3295: 3293: 3290: 3289: 3287: 3283: 3275: 3272: 3271: 3270: 3267: 3265: 3262: 3260: 3257: 3253: 3252:Deep learning 3250: 3249: 3248: 3245: 3241: 3238: 3237: 3236: 3233: 3232: 3230: 3226: 3220: 3217: 3215: 3212: 3208: 3205: 3204: 3203: 3200: 3198: 3195: 3191: 3188: 3186: 3183: 3181: 3178: 3177: 3176: 3173: 3171: 3168: 3166: 3163: 3161: 3158: 3156: 3153: 3151: 3148: 3146: 3143: 3141: 3140:Hallucination 3138: 3134: 3131: 3130: 3129: 3126: 3124: 3121: 3117: 3114: 3113: 3112: 3109: 3108: 3106: 3102: 3096: 3093: 3091: 3088: 3086: 3083: 3081: 3078: 3076: 3073: 3071: 3068: 3066: 3063: 3061: 3058: 3056: 3055: 3051: 3050: 