National Artificial Intelligence Research Resource Pilot | NSF
Contribution: Access to a fully open ecosystem of data, models, training and evaluation software necessary to support a scientific approach to AI research. Initial access will be provided to AI2 Dolma, the largest open dataset to support language model pre-training.
Contribution: Support for at least 20 research projects through access to AWS credits for storage, compute and AI services to build, train and deploy machine learning (ML) models. AWS will also work with the NAIRR pilot to publicize AWS resources to accelerate AI research, including access to pre-trained and customizable AI/ML models, AI/ML training resources and 500+ datasets through the Registry of Open Data on AWS.
Contribution: API access to Anthropic’s Claude model for 10 researchers working on climate change-related projects. Anthropic will also provide educational resources to help those researchers experiment with prompt engineering.
Contribution: Application support for NAIRR pilot users running applications on AMD hardware and collaboration with cloud vendor sites offering access to AMD hardware.
Contribution: Access to Cerebras systems and clusters, providing up to four exaFLOPS of AI compute for NAIRR pilot projects and users and enabling them to rapidly train AI models. Cerebras will also contribute access to open-source datasets and models, and time from its expert data scientists and AI researchers to facilitate project selection, definition and success.
Contribution: Access to Databricks’ Data Intelligence Platform for NAIRR pilot users. This will facilitate use of Databricks’ data processing tools by the research community to analyze existing datasets and create entirely new ones.
Contribution: $2.6 million in the form of access to Datavant’s privacy-preserving record linkage platform, privacy-preserving data discovery tools, and staff expertise in support of the NAIRR Secure NIH component of the pilot as well as elements of the future NAIRR software stack.
Contribution: 100,000 hours of GPU access to support an effort in the research community to train a foundation model for scientific research, in addition to support and expertise for using EleutherAI’s large language model training library on high-performance computing systems.
Contribution: Collaboration across Google Colab, Kaggle and Data Commons programs, including licenses for Colab’s virtual notebooks, integration of Kaggle public resources onto NAIRR pilot infrastructure, and partnership on competitions and red teaming of models. In addition, Data Commons will co-locate an instance with NAIRR pilot computing infrastructure so that the research community can more easily use its diverse, integrated datasets.
Contribution: Access for up to 10 research teams to use Groq’s Language Processing Unit (LPU) Inference Engine via GroqCloud.
Proposed contribution: Time on GPU-powered supercomputing platforms and discounts on supercomputing resources for potential expansion of the NAIRR pilot. In addition, HPE will provide licenses to HPE Machine Learning Development Environment and HPE Machine Learning Data Management Software along with hands-on training for researchers who will have access to datasets, digital twins and performance and productivity tools.
Contribution: 100 compute grants for NAIRR pilot projects and participants to support access to Hugging Face Spaces demos of systems and model evaluation, inference and fine-tuning. Hugging Face will also partner with the NAIRR pilot to set up sharing and evaluation leaderboards for datasets and models developed through or hosted by the pilot.
Contribution: Datasets and benchmarks focused on AI safety and trust evaluation as well as geospatial, time series, materials and chemistry foundation models. IBM will also provide expertise and assistance to researchers working with these resources.
Contribution: Technical training on Intel server platforms, AI technologies and software optimization for NAIRR pilot users working with Intel hardware.
Contribution: Access to Lexset’s Seahaven synthetic data creation software and tools. Lexset will also provide staff time and expertise to support research community data creation projects.
Contribution: Collaboration with NAIRR pilot researchers to support research on Meta’s Llama suite of models, consistent with applicable model licenses.
Contribution: $20 million in compute credits on Microsoft Azure, along with access to leading-edge models, including those available via Azure OpenAI Service. Availability of state-of-the-art resources for developing trustworthy and responsible AI applications, including tools for research and development (R&D) on AI fairness, accuracy, reliability, transparency, privacy and security, and model orchestration. Offerings include resources to enable HIPAA-compliant computing in support of health care research, access to innovative tools for scientific discovery through Azure Quantum Elements, and opportunities to forge collaborative relationships with Microsoft’s scientists and engineers.
Contribution: Access to the MLCommons technology platform to enable testing of AI systems as well as access to AI benchmarks and the suite of MLPerf training, inference and storage benchmarks. MLCommons will also provide hosting services for select open datasets developed by the NAIRR pilot user community.
Contribution: $30 million in overall support for the pilot, including $24 million worth of computing on NVIDIA’s DGX Cloud platform integrated with NVIDIA AI software tools and supported by technical subject-matter experts to assist NAIRR pilot users. In addition, NVIDIA will provide AI software platform licenses to national supercomputing centers integrated with the NAIRR pilot, and run deep-learning workshops, AI boot camps and AI hackathons for NAIRR pilot users.
Contribution: $500,000 in support of the pilot effort, which includes computing infrastructure, dedicated staff time and expertise from portfolio partners. Additionally, Omidyar Network will co-sponsor workshops and future calls for proposals, fostering an inclusive environment for innovation and knowledge-sharing.
Contribution: Up to $1 million in credits for model access for research related to AI safety, evaluations and societal impacts, and up to $250,000 in model access and/or ChatGPT accounts to support applied research and coursework at historically Black colleges and universities and minority-serving institutions. Additionally, OpenAI will provide next-generation AI technology and data processing tools to aid in the digitization and structuring of large-scale datasets that are not currently available for research.
Contribution: Access to an integrated architecture based on privacy-enhancing technologies and up to $500,000 in cloud credits to support research partnerships working to develop AI solutions with sensitive or distributed data.
Contribution: Support for the NAIRR Secure pilot and the expansion of the National Clinical Cohort Collaborative through access to Palantir’s Foundry and AIP platforms deployed at the National Institutes of Health, including compute hours and platform support resources.
Contribution: Large-scale clinical datasets derived from real-world health care data in support of the NAIRR Secure pilot to bolster AI research and innovation in health care.
Contribution: Hosted environment for NAIRR pilot researchers to access pre-configured generative AI and large language models for domain-specific fine-tuning and experimentation. Researchers will receive training and technical support to ensure project success. Technical assistance will also be provided for researchers using the SambaNova cluster deployed at Argonne National Laboratory.
Contribution: $2.4 million in the form of access to Vocareum notebooks and cloud resources for 20,000 students, enabling educators to deliver hands-on AI education in their classrooms.
Contribution: One million GPU-hour compute credits, localized storage and associated staff time and expertise to support projects for the research community.
Contribution: Free academic licenses for NAIRR pilot users to access Weights & Biases’ AI Developer Platform, and technical support to maximize impact from use of the platform.
