# Information
## Purpose
A collection of the latest information and learning resources on NLP, Transformers, and Large Language Models.
## Latest News
- Meta unveils a new, more efficient Llama model : https://techcrunch.com/2024/12/06/meta-unveils-a-new-more-efficient-llama-model - 06/12/2024
- Google’s new generative AI video model is now available : https://www.theverge.com/2024/12/4/24312938/google-veo-generative-ai-video-model-available-preview - 04/12/2024
- DeepMind’s Genie 2 can generate interactive worlds that look like video games : https://techcrunch.com/2024/12/04/deepminds-genie-2-can-generate-interactive-worlds-that-look-like-video-games - 04/12/2024
- Tencent Launches HunyuanVideo, an Open-Source AI Video Model : https://www.maginative.com/article/tencent-launches-hunyuanvideo-an-open-source-ai-video-model - 03/12/2024
- Amazon Is Building a Mega AI Supercomputer With Anthropic : https://www.wired.com/story/amazon-reinvent-anthropic-supercomputer - 03/12/2024
- Amazon announces Nova, a new family of multimodal AI models : https://techcrunch.com/2024/12/03/amazon-announces-nova-a-new-family-of-multimodal-ai-models - 03/12/2024
- PRIME Intellect Releases INTELLECT-1 (Instruct + Base): The First 10B Parameter Language Model Collaboratively Trained Across the Globe : https://www.marktechpost.com/2024/11/29/prime-intellect-releases-intellect-1-instruct-base-the-first-10b-parameter-language-model-collaboratively-trained-across-the-globe - 29/11/2024
- ElevenLabs launches GenFM to turn user content into AI-powered podcasts : https://www.testingcatalog.com/elevenlabs-launches-genfm-to-turn-user-content-into-ai-powered-podcasts - 28/11/2024
- Runway launches Frames — a new AI image generator that creates custom worlds : https://www.tomsguide.com/ai/runway-launches-frames-a-new-ai-image-generator-that-creates-custom-worlds - 26/11/2024
- Introducing FLUX.1 Tools : https://blackforestlabs.ai/flux-1-tools - 21/11/2024
- OpenScholar: The open-source A.I. that’s outperforming GPT-4o in scientific research : https://venturebeat.com/ai/openscholar-the-open-source-a-i-thats-outperforming-gpt-4o-in-scientific-research - 20/11/2024
- DeepSeek’s first reasoning model R1-Lite-Preview turns heads, beating OpenAI o1 performance : https://venturebeat.com/ai/deepseeks-first-reasoning-model-r1-lite-preview-turns-heads-beating-openai-o1-performance - 20/11/2024
- A statistical approach to model evaluations : https://www.anthropic.com/research/statistical-approach-to-model-evals - 20/11/2024
- Suno V4 Ai Music Generator Is Out Now And It’s Very Impressive : https://9meters.com/entertainment/music-entertainment/suno-v4-is-out-now-features-release-date-whats-new - 20/11/2024
- Google’s Gemini chatbot now has memory : https://techcrunch.com/2024/11/19/googles-gemini-chatbot-now-has-memory - 19/11/2024
- Perplexity introduces a shopping feature for Pro users in the US : https://techcrunch.com/2024/11/18/perplexity-introduces-a-shopping-feature-for-pro-users - 18/11/2024
- Mistral unleashes Pixtral Large and upgrades Le Chat into full-on ChatGPT competitor : https://venturebeat.com/ai/mistral-unleashes-pixtral-large-and-upgrades-le-chat-into-full-on-chatgpt-competitor - 18/11/2024
- Watch out, Midjourney — Recraft just announced new AI image generator model : https://www.tomsguide.com/ai/ai-image-video/watch-out-midjourney-recraft-just-announced-new-ai-image-generator-model - 31/10/2024
- GitHub Copilot will support models from Anthropic, Google, and OpenAI : https://www.theverge.com/2024/10/29/24282544/github-copilot-multi-model-anthropic-google-open-ai-github-spark-announcement - 30/10/2024
- Meta AI Silently Releases NotebookLlama: An Open Version of Google’s NotebookLM : https://www.marktechpost.com/2024/10/27/meta-ai-silently-releases-notebookllama-an-open-source-alternative-to-googles-notebooklm - 27/10/2024
- Mochi 1: A new SOTA in open-source video generation models : https://www.genmo.ai/blog - 23/10/2024
- Google DeepMind is making its AI text watermark open source : https://www.technologyreview.com/2024/10/23/1106105/google-deepmind-is-making-its-ai-text-watermark-open-source - 23/10/2024
- OpenAI Unveils Secret Meta Prompt—And It’s Very Different From Anthropic's Approach : https://decrypt.co/285854/openai-secret-meta-prompt-anthropic - 15/10/2024
- INTELLECT-1: Launching the First Decentralized Training of a 10B Parameter Model : https://www.primeintellect.ai/blog/intellect-1 - 11/10/2024
- All Gemini users can now generate images with Imagen 3 : https://9to5google.com/2024/10/09/gemini-imagen-3 - 09/10/2024
- Meta announces Movie Gen, an AI-powered video generator : https://www.theverge.com/2024/10/4/24261990/meta-movie-gen-ai-video-generator-openai-sora - 05/10/2024
- Black Forest Labs releases Flux 1.1 Pro and an API : https://venturebeat.com/ai/black-forest-labs-releases-flux-1-1-pro-and-an-api - 03/10/2024
- Google brings ads to AI Overviews as it expands AI's role in search : https://techcrunch.com/2024/10/03/google-brings-ads-to-ai-overviews-and-rolls-out-ai-organized-pages - 03/10/2024
- Meta Releases Llama 3.2—and Gives Its AI a Voice : https://www.wired.com/story/meta-releases-new-llama-model-ai-voice - 25/09/2024
- Llama 3.2: Revolutionizing edge AI and vision with open, customizable models : https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/ - 25/09/2024
- Introducing Molmo: A Family of State-of-the-Art Open Multimodal Models : https://www.businesswire.com/news/home/20240925326133/en/Introducing-Molmo-A-Family-of-State-of-the-Art-Open-Multimodal-Models - https://molmo.allenai.org/blog - 25/09/2024
- AnyGraph: An Effective and Efficient Graph Foundation Model Designed to Address the Multifaceted Challenges of Structure and Feature Heterogeneity Across Diverse Graph Datasets : https://www.marktechpost.com/2024/09/02/anygraph-an-effective-and-efficient-graph-foundation-model-designed-to-address-the-multifaceted-challenges-of-structure-and-feature-heterogeneity-across-diverse-graph-datasets/ - 02/09/2024
- Forget Sora — MiniMax is a new realistic AI video generator and it’s seriously impressive : https://www.tomsguide.com/ai/ai-image-video/forget-sora-minimax-is-a-new-realistic-ai-video-generator-and-it-is-seriously-impressive - 02/09/2024
- Qwen2-VL: To See the World More Clearly : https://qwenlm.github.io/blog/qwen2-vl/ - 29/08/2024
- 100M Token Context Windows : https://magic.dev/blog/100m-token-context-windows - 29/08/2024
- Alibaba releases new AI model Qwen2-VL that can analyze videos more than 20 minutes long : https://venturebeat.com/ai/alibaba-releases-new-ai-model-qwen2-vl-that-can-analyze-videos-more-than-20-minutes-long - 29/08/2024
- OpenAI Aims to Release New AI Model, ‘Strawberry,’ in Fall : https://www.pymnts.com/news/artificial-intelligence/2024/openai-aims-release-new-ai-model-strawberry-fall/ - 27/08/2024
- Hermes 3 : https://nousresearch.com/hermes3/ - 24/08/2024
- Microsoft reveals Phi-3.5 — this new small AI model outperforms Gemini and GPT-4o : https://www.tomsguide.com/ai/microsoft-reveals-phi-35-this-new-small-ai-model-outperforms-gemini-and-gpt-4o - 23/08/2024
- Ideogram AI expands its features with v2 model and color palette options : https://www.testingcatalog.com/ideogram-ai-expands-its-features-with-v2-model-and-color-palette-options/ - https://about.ideogram.ai/2.0 - 21/08/2024
- Luma drops Dream Machine 1.5 — here’s what’s new : https://www.tomsguide.com/ai/ai-image-video/luma-drops-dream-machine-15-heres-whats-new - 20/08/2024
- Samsung to Adopt High-NA Lithography Alongside Intel, Ahead of TSMC : https://www.extremetech.com/computing/samsung-to-adopt-high-na-lithography-alongside-intel-ahead-of-tsmc - 19/08/2024
- Google Releases Powerful AI Image Generator You Can Use for Free : https://petapixel.com/2024/08/19/google-releases-powerful-ai-image-generator-you-can-use-for-free-imagen-3 - 19/08/2024
- Google’s AI-generated search summaries change how they show their sources : https://www.theverge.com/2024/8/15/24220581/google-search-ai-overviews-links-citations-expanded-rollout - 16/08/2024
- Prompt Caching is Now Available on the Anthropic API for Specific Claude Models : https://www.marktechpost.com/2024/08/15/prompt-caching-is-now-available-on-the-anthropic-api-for-specific-claude-models/ - 15/08/2024
- Unveiling Hermes 3: The First Full-Parameter Fine-Tuned Llama 3.1 405B Model is on Lambda’s Cloud : https://lambdalabs.com/blog/unveiling-hermes-3-the-first-fine-tuned-llama-3.1-405b-model-is-on-lambdas-cloud - https://nousresearch.com/hermes3/ - 15/08/2024
- Meet Black Forest Labs, the startup powering Elon Musk’s unhinged AI image generator : https://techcrunch.com/2024/08/14/meet-black-forest-labs-the-startup-powering-elon-musks-unhinged-ai-image-generator/ - 14/08/2024
- OpenAI reveals an updated GPT-4o model - but can't quite explain how it's better : https://www.zdnet.com/article/openai-reveals-an-updated-gpt-4o-model-but-cant-quite-explain-how-its-better - 14/08/2024
- Artists’ lawsuit against Stability AI and Midjourney gets more punch : https://www.theverge.com/2024/8/13/24219520/stability-midjourney-artist-lawsuit-copyright-trademark-claims-approved - 14/08/2024
- FalconMamba 7B Released: The World’s First Attention-Free AI Model with 5500GT Training Data and 7 Billion Parameters : https://www.marktechpost.com/2024/08/12/falconmamba-7b-released-the-worlds-first-attention-free-ai-model-with-5500gt-training-data-and-7-billion-parameters/ - 12/08/2024
- Perplexity AI: The Game-Changer in Conversational AI and Web Search : https://originality.ai/blog/perplexity-ai-statistics - 11/08/2024
- Black Forest Labs Open-Source FLUX.1: A 12 Billion Parameter Rectified Flow Transformer Capable of Generating Images from Text Descriptions : https://www.marktechpost.com/2024/08/02/black-forest-labs-open-source-flux-1-a-12-billion-parameter-rectified-flow-transformer-capable-of-generating-images-from-text-descriptions/ - 02/08/2024
- Runway just dropped image-to-video in Gen3 — I tried it and it changes everything : https://www.tomsguide.com/ai/ai-image-video/runway-drops-image-to-video-in-gen3-i-tried-it-and-it-changes-everything - 31/07/2024
- Midjourney drops surprise v6.1 update — now humans look more real than ever : https://www.tomsguide.com/ai/ai-image-video/midjourney-drops-surprise-v61-update-now-humans-look-more-real-than-ever - 31/07/2024
- Instagram starts letting people create AI versions of themselves : https://www.theverge.com/24209196/instagram-ai-characters-meta-ai-studio-release - 30/07/2024
- Canva acquires Leonardo.ai to boost its generative AI efforts : https://techcrunch.com/2024/07/29/canva-acquires-leonardo-ai-to-boost-its-generative-ai-efforts/ - 29/07/2024
- Microsoft is adding AI-powered summaries to Bing search results : https://www.engadget.com/microsoft-is-adding-ai-powered-summaries-to-bing-search-results-203053790.html - 25/07/2024
- Large Enough (Mistral Large 2) : https://mistral.ai/news/mistral-large-2407 - 24/07/2024
- Introducing Llama 3.1: Our most capable models to date : https://ai.meta.com/blog/meta-llama-3-1 and https://llama.meta.com - 23/07/2024
- SearchGPT Prototype : https://openai.com/index/searchgpt-prototype - 25/07/2024
- Hugging Face Releases SmolLM, a Series of Small Language Models, Beats Qwen2 and Phi 1.5 : https://analyticsindiamag.com/ai-news-updates/hugging-face-releases-smollm-a-series-of-small-language-models-beats-qwen2-and-phi-1-5/ - 20/07/2024
- Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model : https://blogs.nvidia.com/blog/mistral-nvidia-ai-model/ - 18/07/2024
- OpenAI unveils GPT-4o mini, a smaller and cheaper AI model : https://techcrunch.com/2024/07/18/openai-unveils-gpt-4o-mini-a-small-ai-model-powering-chatgpt - 18/07/2024
- Mistral releases Codestral Mamba for faster, longer code generation : https://venturebeat.com/ai/mistral-releases-codestral-mamba-for-faster-longer-code-generation/ - 16/07/2024
- Anthropic releases Claude app for Android : https://techcrunch.com/2024/07/16/anthropic-releases-claude-app-for-android - 16/07/2024
- Exclusive: OpenAI working on new reasoning technology under code name ‘Strawberry’ : https://www.reuters.com/technology/artificial-intelligence/openai-working-new-reasoning-technology-under-code-name-strawberry-2024-07-12/ - 16/07/2024
- Exclusive: meet Haiper 1.5, the new AI video generation model challenging Sora, Runway : https://venturebeat.com/ai/exclusive-meet-haiper-1-5-the-new-ai-video-generation-model-challenging-sora-runway/ - 16/07/2024
- Stable Diffusion 3 License Revamped Amid Blowback, Promising Better Model : https://decrypt.co/238871/stable-diffusion-3-license-revamped-amid-blowback - 08/07/2024
- Elon Musk Reveals Plans To Make World’s “Most Powerful” 100,000 NVIDIA GPU AI Cluster : https://wccftech.com/elon-musk-reveals-plans-to-make-worlds-most-powerful-10000-nvidia-gpu-ai-cluster - 09/07/2024
- Anthropic’s Claude adds a prompt playground to quickly improve your AI apps : https://techcrunch.com/2024/07/09/anthropics-claude-adds-a-prompt-playground-to-quickly-improve-your-ai-apps/ - 09/07/2024
- Odyssey Building 'Hollywood-Grade' AI Text-to-Video Model to Compete With Sora, Gen-3 Alpha : https://www.gadgets360.com/ai/news/odyssey-ai-text-to-video-model-hollywood-grade-report-6067589 - 09/07/2024
- Groq unveils lightning-fast LLM engine; developer base rockets past 280K in 4 months : https://venturebeat.com/ai/groq-releases-blazing-fast-llm-engine-passes-270000-user-mark/ - 08/07/2024
- Mozilla Llamafile, Builders Projects Shine at AI Engineers World’s Fair : https://thenewstack.io/mozilla-llamafile-builders-projects-shine-at-ai-engineers-worlds-fair/?utm_referrer=https%3A%2F%2Fwww.lastweekinai.com%2F - 02/07/2024
- Exclusive: This is Google AI, and it's coming to the Pixel 9 : https://www.androidauthority.com/google-ai-recall-pixel-9-3456399/ - 02/07/2024
- Meta is about to launch its biggest Llama model yet — here’s why it’s a big deal : https://www.tomsguide.com/ai/meta-is-about-to-launch-its-biggest-llama-model-yet-heres-why-its-a-big-deal - 02/07/2024
- Runway’s Gen-3 Alpha AI video model now available – but there’s a catch : https://venturebeat.com/ai/runways-gen-3-alpha-ai-video-model-now-available-but-theres-a-catch/ - 01/07/2024
- Google’s Gemma 2 series launches with not one, but two lightweight model options—a 9B and 27B : https://venturebeat.com/ai/googles-gemma-2-series-launches-with-not-one-but-two-lightweight-model-options-a-9b-and-27b/ - 27/06/2024
- Meet Sohu: The World’s First Transformer Specialized Chip ASIC : https://www.marktechpost.com/2024/06/26/meet-sohu-the-worlds-first-transformer-specialized-chip-asic - 26/06/2024
- Anthropic Debuts Collaboration Tools for Claude AI Assistant : https://www.pymnts.com/artificial-intelligence-2/2024/anthropic-debuts-collaboration-tools-for-claude-ai-assistant - 25/06/2024
- Anthropic just dropped Claude 3.5 Sonnet with better vision and a sense of humor : https://www.tomsguide.com/ai/anthropic-just-dropped-claude-35-sonnet-with-better-vision-and-a-sense-of-humor - 21/06/2024
- Runway unveils new hyper realistic AI video model Gen-3 Alpha, capable of 10-second-long clips : https://venturebeat.com/ai/runway-unveils-new-hyper-realistic-ai-video-model-gen-3-alpha-capable-of-10-second-long-clips/ - 17/06/2024
- ‘We don’t need Sora anymore’: Luma’s new AI video generator Dream Machine slammed with traffic after debut : https://venturebeat.com/ai/we-dont-need-sora-anymore-lumas-new-ai-video-generator-dream-machine-slammed-with-traffic-after-debut - 12/06/2024
- ‘Apple Intelligence’ will automatically choose between on-device and cloud-powered AI : https://www.theverge.com/2024/6/7/24173528/apple-intelligence-ai-features-openai-chatbot - 07/06/2024
- Impressive KLING AI video generator now available internationally : https://the-decoder.com/impressive-kling-ai-video-generator-now-available-internationally - 06/06/2024
- Udio introduces new udio-130 music generation model and more advanced features : https://braintitan.medium.com/udio-introduces-new-udio-130-music-generation-model-and-more-advanced-features-3f08b9909f7b - 30/05/2024
- Perplexity AI's new feature will turn your searches into shareable pages : https://techcrunch.com/2024/05/30/perplexity-ais-new-feature-will-turn-your-searches-into-sharable-pages - 30/05/2024
- PwC agrees deal to become OpenAI’s first reseller and largest enterprise user : https://www.cnbc.com/2024/05/29/pwc-to-become-openais-first-reseller-and-largest-enterprise-user.html - 29/05/2024
- Opera is adding Google's Gemini AI to its browser : https://www.engadget.com/opera-is-adding-googles-gemini-ai-to-its-browser-120013023.html - 28/05/2024
- Google unveils Veo and Imagen 3, its latest AI media creation models : https://www.engadget.com/google-unveils-veo-and-imagen-3-its-latest-ai-media-creation-models-173617373.html - 15/05/2024
- Google is redesigning its search engine — and it’s AI all the way down : https://www.theverge.com/2024/5/14/24155321/google-search-ai-results-page-gemini-overview - 15/05/2024
- OpenAI releases GPT-4o, a faster model that’s free for all ChatGPT users : https://www.theverge.com/2024/5/13/24155493/openai-gpt-4o-launching-free-for-all-chatgpt-users - 14/05/2024
- Stability AI sows gen AI discord with Stable Artisan : https://venturebeat.com/ai/stability-ai-sows-gen-ai-discord-with-stable-artisan/ - 09/05/2024
- New Microsoft AI model may challenge GPT-4 and Google Gemini : https://arstechnica.com/information-technology/2024/05/microsoft-developing-mai-1-language-model-that-may-compete-with-openai-report/ - 07/05/2024
- GitHub releases an AI-powered tool aiming for a 'radically new way of building software' : https://www.zdnet.com/article/github-releases-an-ai-powered-tool-that-is-a-radically-new-way-of-building-software/ - 29/04/2024
- Microsoft launches Phi-3, its smallest AI model yet : https://www.theverge.com/2024/4/23/24137534/microsoft-phi-3-launch-small-ai-language-model - 23/04/2024
- Introducing Phi-3: Redefining what’s possible with SLMs : https://azure.microsoft.com/en-us/blog/introducing-phi-3-redefining-whats-possible-with-slms - 23/04/2024
- LLAMA 3 : https://llama.meta.com/llama3 - 19/04/2024
- Amazon Music’s Maestro lets listeners make AI playlists : https://www.theverge.com/2024/4/16/24132129/amazon-music-maestro-ai-playlist-prompts - 17/04/2024
- TSMC's $65 billion bet still leaves US missing piece of chip puzzle : https://arstechnica.com/gadgets/2024/04/tsmcs-65-billion-bet-still-leaves-us-missing-piece-of-chip-puzzle - 12/04/2024
- Udio AI Music Raises $10 Million, $6.5 Million For Spines AI, More Sora, Cinematic AI : https://www.forbes.com/sites/charliefink/2024/04/11/udio-ai-music-raises-10-million-65-million-for-spines-ai-more-sora-cinematic-ai/?sh=57c270912ed6 - 11/04/2024
- Mistral AI Stuns With Surprise Launch of New Mixtral 8x22B Model : https://analyticsindiamag.com/mistral-ai-stuns-with-surprise-launch-of-new-mixtral-8x22b-model - 10/04/2024
- Google updates its Gemma AI model family with variants for coding and research : https://siliconangle.com/2024/04/10/google-updates-gemma-ai-model-family-variants-coding-research - 10/04/2024
- AI editing tools are coming to all Google Photos users : https://blog.google/products/photos/google-photos-editing-features-availability - 10/04/2024
- Elon Musk says the next-generation Grok 3 model will require 100,000 Nvidia H100 GPUs to train : https://www.tomshardware.com/tech-industry/artificial-intelligence/elon-musk-says-the-next-generation-grok-3-model-will-require-100000-nvidia-h100-gpus-to-train - 10/04/2024
- Groq CEO: 'We No Longer Sell Hardware' : https://www.eetimes.com/groq-ceo-we-no-longer-sell-hardware - 05/04/2024
- Humanoid robots are joining the Mercedes-Benz workforce : https://www.freethink.com/robots-ai/apollo-robots - 30/03/2024
- Introducing DBRX: A New State-of-the-Art Open LLM : https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm - 27/03/2024
- Sora: first impressions : https://openai.com/blog/sora-first-impressions - 25/03/2024
- 16 Changes to the Way Enterprises Are Building and Buying Generative AI : https://a16z.com/generative-ai-enterprise-2024 - 21/03/2024
- The iPhone-maker is in ‘active’ talks to bring Gemini to the iPhone, and has also considered using ChatGPT : https://www.theverge.com/2024/3/18/24104626/apple-license-google-gemini-generative-ai-openai-chatgpt - 18/03/2024
- Stability AI brings a new dimension to video with Stable Video 3D : https://venturebeat.com/ai/stability-ai-brings-a-new-dimension-to-video-with-stable-video-3d - 18/03/2024
- Open Release of Grok-1 : https://x.ai/blog/grok-os - 17/03/2024
- Anthropic releases Claude 3 Haiku, an AI model built for speed and affordability : https://venturebeat.com/ai/anthropic-releases-claude-3-haiku-an-ai-model-built-for-speed-and-affordability - 13/03/2024
- OpenAI's GPT-4.5 Turbo leaked on search engines and could launch in June : https://the-decoder.com/openais-gpt-4-5-turbo-leaked-on-search-engines-and-could-launch-in-june - 12/03/2024
- Introducing Devin, the first AI software engineer : https://www.cognition-labs.com/introducing-devin - 12/03/2024
- Cohere releases powerful 'Command-R' language model for enterprise use : https://venturebeat.com/ai/cohere-releases-powerful-command-r-language-model-for-enterprise-use - 11/03/2024
- Salesforce announces new AI tools for doctors : https://www.cnbc.com/2024/03/07/salesforce-announces-new-ai-tools-for-doctors.html - 07/03/2024
- Inflection-2.5: meet the world's best personal AI : https://inflection.ai/inflection-2-5 - 07/03/2024
- Competition in AI video generation heats up as DeepMind alums unveil Haiper : https://techcrunch.com/2024/03/05/competition-in-ai-video-generation-heats-up-as-deepmind-alums-unveil-haiper - 06/03/2024
- Introducing the next generation of Claude : https://www.anthropic.com/news/claude-3-family - 03/03/2024
- Meta AI creates ahistorical images, like Google Gemini : https://www.axios.com/2024/03/01/meta-ai-google-gemini-black-founding-fathers - 01/03/2024
- It's official: Waymo robotaxis are now free to use freeways and leave San Francisco : https://sfstandard.com/2024/03/01/waymo-san-francisco-cpuc-expansion-approval - 01/03/2024
- Here Come the AI Worms : https://www.wired.com/story/here-come-the-ai-worms - 01/03/2024
- Figure Raises $675M at $2.6B Valuation and Signs Collaboration Agreement with OpenAI : https://www.prnewswire.com/news-releases/figure-raises-675m-at-2-6b-valuation-and-signs-collaboration-agreement-with-openai-302074897.html - 29/02/2024
- Mistral AI releases new model to rival GPT-4 and its own chat assistant : https://techcrunch.com/2024/02/26/mistral-ai-releases-new-model-to-rival-gpt-4-and-its-own-chat-assistant - 27/02/2024
- Stability announces Stable Diffusion 3, a next-gen AI image generator : https://arstechnica.com/information-technology/2024/02/stability-announces-stable-diffusion-3-a-next-gen-ai-image-generator - 23/02/2024
- What two years of AI development can tell us about Sora : https://www.vox.com/future-perfect/24080195/sora-openai-sam-altman-ai-generated-videos-disinformation-midjourney-dalle - 23/02/2024
- Gemma: Introducing new state-of-the-art open models : https://blog.google/technology/developers/gemma-open-models - 21/02/2024
- Introducing LlamaCloud and LlamaParse : https://blog.llamaindex.ai/introducing-llamacloud-and-llamaparse-af8cedf9006b - 21/02/2024
- Generative AI Startup Mistral Releases Free ‘Open-Source’ 7.3B Parameter LLM : https://voicebot.ai/2024/02/19/generative-ai-startup-mistral-releases-free-open-source-7-3b-parameter-llm-2 - 19/02/2024
- FOD#41: GPU's rival? What is Language Processing Unit (LPU) : https://www.turingpost.com/p/fod41 - 19/02/2024
- Our next-generation model: Gemini 1.5 : https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024 (https://deepmind.google/technologies/gemini/#gemini-1.5) - 15/02/2024
- Creating video from text - Sora : https://openai.com/sora - 15/02/2024
- Amazon AGI Team Say Their AI Is Showing "Emergent Abilities" : https://futurism.com/the-byte/amazon-researchers-ai-emergent - 15/02/2024
- AI Computing Firm Lambda Raises $320 Million in Fresh Funding : https://www.usnews.com/news/technology/articles/2024-02-15/ai-computing-firm-lambda-raises-320-million-in-fresh-funding - 15/02/2024
- Cohere for AI launches open source LLM for 101 languages : https://venturebeat.com/ai/cohere-for-ai-launches-open-source-llm-for-101-languages/ - 13/02/2024
- OpenAI reportedly developing two AI agents to automate entire work processes : https://the-decoder.com/openai-reportedly-developing-two-ai-agents-to-automate-entire-work-processes/?amp=1 - 08/02/2024
- Meet 'Smaug-72B': The new king of open-source AI : https://venturebeat.com/ai/meet-smaug-72b-the-new-king-of-open-source-ai - 06/02/2024
- Apple releases 'MGIE', a revolutionary AI model for instruction-based image editing : https://venturebeat.com/ai/apple-releases-mgie-a-revolutionary-ai-model-for-instruction-based-image-editing - 06/02/2024
- Introducing Qwen1.5 : https://qwenlm.github.io/blog/qwen1.5 - 04/02/2024
- Allen Institute for AI launches open and transparent OLMo large language model : https://siliconangle.com/2024/02/01/allen-institute-ai-fully-open-transparent-olmo-llm-rival-openai-google - 01/02/2024
- Arc Search's AI responses launched as an unfettered experience with no guardrails : https://mashable.com/article/arc-search-browser-app-ai-no-guardrails - 01/02/2024
- YouTube is cracking down on AI-generated true crime deepfakes : https://www.theverge.com/2024/1/8/24030107/youtube-ai-deepfakes-true-crime-victims-minors - 24/01/2024
- Google is using AI to organize and customize your Chrome browser : https://www.theverge.com/2024/1/23/24047843/google-chrome-browser-ai-organize-tabs-themes - 24/01/2024
- Claude 2.1 Multi-Modal Assistance : https://claudeai.uk/claude-2-1-multi-modal-assistance/ - 21/01/2024
- Stability AI unveils smaller, more efficient 1.6B language model as part of ongoing innovation : https://venturebeat.com/ai/stability-ai-unveils-smaller-more-efficient-1-6b-language-model-as-part-of-ongoing-innovation/ - 19/01/2024
- The Rabbit R1 will receive live info from Perplexity’s AI ‘answer engine’ : https://www.theverge.com/2024/1/18/24043490/rabbit-r1-ai-perplexity-pro-live-search-info-answers - 19/01/2024
- Meta begins training Llama 3, reshuffles AI responsibilities : https://www.axios.com/2024/01/18/zuckerberg-meta-llama-3-ai - 18/01/2024
- Samsung’s latest Galaxy phones offer live translation over phone calls, texts : https://techcrunch.com/2024/01/17/samsungs-latest-galaxy-phones-offer-live-translation-over-phone-calls-texts - 17/01/2024
- Microsoft Copilot is now using the previously-paywalled GPT-4 Turbo, saving you $20 a month : https://www.windowscentral.com/software-apps/microsoft-copilot-is-now-using-the-previously-paywalled-gpt-4-turbo-saving-you-dollar20-a-month - 16/01/2024
- Getty and Nvidia bring generative AI to stock photos : https://www.theverge.com/2024/1/8/24027259/getty-images-nvidia-generative-ai-stock-photos - 09/01/2024
- A list going viral reveals famous artists whose work was used to train AI generator : https://www.nbcnews.com/tech/tech-news/famous-artists-trained-ai-generator-viral-list-rcna131995 - 05/01/2024
- AI-powered search engine Perplexity AI, now valued at $520M, raises $73.6M : https://techcrunch.com/2024/01/04/ai-powered-search-engine-perplexity-ai-now-valued-at-520m-raises-70m - 04/01/2024
- Early Mickey Mouse is now in the public domain—and AI is already on the case : https://arstechnica.com/information-technology/2024/01/early-mickey-mouse-is-now-in-the-public-domain-and-ai-is-already-on-the-case/ - 03/01/2024
[Full listing of latest news](News.md)
## Articles and research papers
- Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge : https://arxiv.org/abs/2410.16454 - 21/10/2024
- Diffusion Models Are Real-Time Game Engines : https://arxiv.org/abs/2408.14837v1 - 27/08/2024
- Capabilities of Gemini Models in Medicine : https://arxiv.org/abs/2404.18416 - 29/04/2024
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone : https://arxiv.org/abs/2404.14219 - 22/04/2024
- Llama 3 is not very censored : https://ollama.com/blog/llama-3-is-not-very-censored - 19/04/2024
- Introducing Idefics2: A Powerful 8B Vision-Language Model for the community : https://huggingface.co/blog/idefics2 - 15/04/2024
- THE AI INDEX REPORT - Measuring trends in AI : https://aiindex.stanford.edu/report - Apr 2024
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length : https://arxiv.org/abs/2404.08801 - 12/04/2024
- Long-form factuality in large language models : https://arxiv.org/abs/2403.18802 - 27/03/2024
- Data Interpreter: An LLM Agent For Data Science : https://arxiv.org/abs/2402.18679v1 - 12/03/2024
- Stealing Part of a Production Language Model : https://arxiv.org/abs/2403.06634v1 - 11/03/2024
- Chain-of-Thought Reasoning Without Prompting : https://arxiv.org/abs/2402.10200v1 - 15/02/2024
- World Model on Million-Length Video And Language With RingAttention : https://arxiv.org/abs/2402.08268v1 - 13/02/2024
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks : https://arxiv.org/abs/2402.04248v1 - 06/02/2024
- The Impact of Reasoning Step Length on Large Language Models : https://arxiv.org/abs/2401.04925v1 - 10/01/2024
- PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models : https://arxiv.org/abs/2401.05252v1 - 10/01/2024
- Mixtral of Experts : https://arxiv.org/abs/2401.04088 - 08/01/2024
- LLAMA PRO: Progressive LLaMA with Block Expansion : LLAMA PRO: Progressive LLaMA with Block Expansion - 04/01/2024
- Some of the Samsung Galaxy S24's key AI features just leaked : https://www.techradar.com/phones/samsung-galaxy-phones/some-of-the-samsung-galaxy-s24s-key-ai-features-just-leaked - 01/01/2024
- Improving Text Embeddings with Large Language Models : https://arxiv.org/abs/2401.00368v1 - 31/12/2023
- Customizing Realistic Human Photos via Stacked ID Embedding : https://arxiv.org/pdf/2312.04461v1.pdf - 07/12/2023
- Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine : https://arxiv.org/abs/2311.16452v1 - 28/11/2023
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones : https://arxiv.org/abs/2312.16862 - 28/12/2023
- White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? : https://arxiv.org/abs/2311.13110v2 - 22/11/2023
- OpenAI DevDay: Beyond the Headlines with Logan Kilpatrick, OpenAI's Dev Relations Lead : https://www.youtube.com/watch?v=WFM2pvj00oc - 08/11/2023
- Introducing GPTs : https://openai.com/blog/introducing-gpts - 06/11/2023
- Adversarial Attacks and Defenses in Large Language Models: Old and New Threats : https://arxiv.org/pdf/2310.19737v1.pdf - 23/10/2023
- LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B : https://arxiv.org/pdf/2310.20624.pdf - 23/10/2023
- Managing AI Risks in an Era of Rapid Progress : https://arxiv.org/pdf/2310.17688.pdf - 23/10/2023
- Contrastive Preference Learning: Learning from Human Feedback without RL : https://arxiv.org/abs/2310.13639v1 - 20/10/2023
- AgentTuning: Enabling Generalized Agent Abilities for LLMs : https://huggingface.co/papers/2310.12823 - 20/10/2023
- Llemma: An Open Language Model For Mathematics : https://arxiv.org/abs/2310.10631v1 - 16/10/2023
- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning : https://arxiv.org/abs/2310.09478 - 14/10/2023
- LLark: A Multimodal Foundation Model for Music : https://arxiv.org/abs/2310.07160v1 - 11/10/2023
- LongLLMLingua: ACCELERATING AND ENHANCING LLMS IN LONG CONTEXT SCENARIOS VIA PROMPT COMPRESSION : https://arxiv.org/pdf/2310.06839.pdf - 10/10/2023
- HyperAttention: Long-context Attention in Near-Linear Time : https://arxiv.org/abs/2310.05869v2 - 09/10/2023
- TimeGPT-1 : https://arxiv.org/abs/2310.03589v1 - 05/10/2023
- A Long Way to Go: Investigating Length Correlations in RLHF : https://arxiv.org/abs/2310.03716v1 - 05/10/2023
- Understanding the Effects of RLHF on LLM Generalisation and Diversity : https://arxiv.org/abs/2310.06452v1 - 05/10/2023
- Language Modeling Is Compression : https://arxiv.org/pdf/2309.10668.pdf - 19/09/2023
- SyncDreamer: Generating Multiview-consistent Images from a Single-view Image : https://arxiv.org/abs/2309.03453v1 - 07/09/2023
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback : https://arxiv.org/abs/2309.00267v1 - 01/09/2023
- Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities : https://arxiv.org/abs/2308.12966v1 - 24/08/2023
- Perception, performance, and detectability of conversational artificial intelligence across 32 university courses : https://www.nature.com/articles/s41598-023-38964-3 - 24/08/2023
- Prompt2Model : https://arxiv.org/pdf/2308.12261v1.pdf - 23/08/2023
- Efficient Benchmarking (of Language Models) : https://arxiv.org/abs/2308.11696v1 - 22/08/2023
- Reinforced Self-Training (ReST) for Language Modeling : https://arxiv.org/pdf/2308.08998.pdf - 21/08/2023
- Graph of Thoughts: Solving Elaborate Problems with Large Language Models : https://arxiv.org/abs/2308.09687v2 - 21/08/2023
- RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models : https://arxiv.org/abs/2308.07922v1 - 15/08/2023
- Self-Alignment with Instruction Backtranslation : https://arxiv.org/abs/2308.06259v2 - 11/08/2023
- BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents : https://arxiv.org/abs/2308.05960v1 - 11/08/2023
- CoTracker: It is Better to Track Together : https://co-tracker.github.io/ - https://arxiv.org/abs/2307.07635 - 07/07/2023
- LIMA: Less Is More for Alignment : https://arxiv.org/abs/2305.11206
- Introduction to Weight Quantization : https://towardsdatascience.com/introduction-to-weight-quantization-2494701b9c0c
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers : https://arxiv.org/pdf/2210.17323.pdf - 22/03/2023
- Context-faithful Prompting for Large Language Models : https://arxiv.org/abs/2303.11315 - 20/03/2023
- LLaMA: Open and Efficient Foundation Language Models : https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models - 24/02/2023
- Understanding Large Language Models : A Transformative Reading List : https://sebastianraschka.com/blog/2023/llm-reading-list.html - 07/02/2023
- Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor : https://arxiv.org/pdf/2212.09689.pdf - 09/12/2022
- OpenAI Research : https://openai.com/research
- Large Language Models are Zero-Shot Reasoners : https://arxiv.org/abs/2205.11916 - 24/05/2022
- (GPT-1) Improving Language Understanding by Generative Pre-Training : https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf - see also https://openai.com/research/language-unsupervised and https://huggingface.co/docs/transformers/model_doc/openai-gpt - 2018
- (GPT-2) Language Models are Unsupervised Multitask Learners : https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf - see also https://openai.com/research/better-language-models and https://github.com/openai/gpt-2 (source code) - 2019
- (GPT-3) Language Models are Few-Shot Learners : https://openai.com/research/language-models-are-few-shot-learners and https://arxiv.org/abs/2005.14165 - 28/05/2020
- Fast Transformer Decoding: One Write-Head is All You Need : https://arxiv.org/abs/1911.02150 - 06/11/2019
- How GPT3 Works - Visualizations and Animations : https://jalammar.github.io/how-gpt3-works-visualizations-animations/
- The Illustrated GPT-2 (Visualizing Transformer Language Models) : https://jalammar.github.io/illustrated-gpt2
- Generalized Language Models : https://lilianweng.github.io/posts/2019-01-31-lm - 31/01/2019
- Language Models are Unsupervised Multitask Learners : https://paperswithcode.com/paper/language-models-are-unsupervised-multitask -> https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf - 2019
- A Visual Guide to Using BERT for the First Time : https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time - 2018
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding : https://arxiv.org/abs/1810.04805 - 11/10/2018
- Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing : https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html - 02/11/2018
- Attention? Attention! : https://lilianweng.github.io/posts/2018-06-24-attention - 24/06/2018
- The Annotated Transformer : http://nlp.seas.harvard.edu/2018/04/03/attention.html - 03/04/2018
- Attention Is All You Need : https://arxiv.org/abs/1706.03762 and https://arxiv.org/pdf/1706.03762.pdf - 12/06/2017
## Resources
### GitHub and software
- bitsandbytes : https://github.com/TimDettmers/bitsandbytes - https://huggingface.co/blog/4bit-transformers-bitsandbytes - a lightweight wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and quantization functions
- BIG-Bench Hard : https://github.com/suzgunmirac/BIG-Bench-Hard
- Bocoel : https://github.com/rentruewang/bocoel
- Braq (Customizable data format for config files, AI prompts, and more) : https://github.com/pyrustic/braq
- Dalai : https://github.com/cocktailpeanut/dalai - Run LLaMA and Alpaca on your computer
- GGML : https://github.com/ggerganov/ggml
- koboldcpp : https://github.com/LostRuins/koboldcpp/wiki
- langchain : https://github.com/langchain-ai/langchain
- LoRA: Low-Rank Adaptation of Large Language Models : https://github.com/microsoft/LoRA
- Llama : https://github.com/facebookresearch/llama
- llama.cpp : https://github.com/ggerganov/llama.cpp
- LMSYS - Fastchat: https://github.com/lm-sys/FastChat -> Vicuna : https://lmsys.org/blog/2023-03-30-vicuna - Cost of training Vicuna-13B is around $300 -> Vicuna Installation Guide : https://github.com/vicuna-tools/vicuna-installation-guide
- LMSYS Chatbot Arena Leaderboard : https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
- Med-Flamingo : https://github.com/snap-stanford/med-flamingo
- MS Engineering Playbook : https://microsoft.github.io/code-with-engineering-playbook/
- Mistral AI : https://github.com/mistralai/mistral-src
- NanoPhi : https://github.com/VatsaDev/NanoPhi
- Ollama : https://github.com/ollama/ollama
- OpenGPTs : https://github.com/langchain-ai/opengpts
- Parameter-Efficient Fine-Tuning (PEFT) methods : https://github.com/huggingface/peft - enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters
- QLoRA: Efficient Finetuning of Quantized LLMs : https://github.com/artidoro/qlora
- RedPajama-Data : https://github.com/togethercomputer/RedPajama-Data - An Open Source Recipe to Reproduce LLaMA training dataset
- Stable Diffusion web UI : https://github.com/AUTOMATIC1111/stable-diffusion-webui
- Web LLM : https://github.com/mlc-ai/web-llm
- Self-Operating Computer Framework : https://github.com/OthersideAI/self-operating-computer/tree/main
- Weights & Biases : https://wandb.ai/site - https://github.com/wandb/wandb
- WizardVicunaLM : https://github.com/melodysdreamj/WizardVicunaLM - Wizard's dataset + ChatGPT's conversation extension + Vicuna's tuning method
- BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains : https://huggingface.co/BioMistral/BioMistral-7B
- Cohere : https://cohere.com/
- Claude : https://claude.ai/
- Groq (LPU) : https://groq.com/
- Hugging Face : https://huggingface.co - https://huggingface.co/models
- Ideogram : https://ideogram.ai/
- LlamaIndex : https://www.llamaindex.ai/
- Lumiere : https://lumiere-video.github.io/
- Midjourney : https://www.midjourney.com
- Mistral AI : https://mistral.ai/
- Modular - mojo : https://www.modular.com/mojo
- OpenAI : https://openai.com
- Pika : https://pika.art/
- Stable Diffusion : https://stability.ai/stablediffusion
- Large World Model (LWM) : https://github.com/LargeWorldModel/LWM
- Udio : https://www.udio.com
### Learning/Education
- Andrej Karpathy - Let's build the GPT Tokenizer : https://www.youtube.com/watch?v=zduSFxRajkE - 21/02/2024
- Applied Deep Learning : https://github.com/maziarraissi/Applied-Deep-Learning
- Attention is all you need; Attentional Neural Network Models : https://www.youtube.com/watch?v=rBCqOTEfxvg - Oct 2017
- Awesome transformers : https://github.com/cedrickchee/awesome-transformer-nlp
- CyientifiQ Innovation League : https://cyient.hackerearth.com/ - 16/09/2023 - 23/10/2023
- Data Engineering Wiki : https://dataengineering.wiki/Learning+Resources
- Deep Learning Short Courses : https://www.deeplearning.ai/short-courses/
- Geometric Deep Learning : https://geometricdeeplearning.com/blogs/
- How GPT3 Works - Easily Explained with Animations : https://www.youtube.com/watch?v=MQnJZuBGmSQ - Aug 2020 - Basic overview
- Hugging Face NLP Course : https://huggingface.co/learn/nlp-course/chapter1/1
- Interfaces for Explaining Transformer Language Models: https://jalammar.github.io/explaining-transformers - 2020
- Introduction to Large Language Models : https://docs.cohere.com/docs/introduction-to-large-language-models
- Lightning AI : https://lightning.ai/lightning-ai/studios/optimized-llm-inference-api-for-mistral-7b-using-vllm - https://lightning.ai/lightning-ai/studios/run-mistral-moe-mixture-of-experts
- LLM Bootcamp - Spring 2023 : https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/
- LLM University! : https://docs.cohere.com/docs/llmu
- Machine Learning Developer Notes : https://9600.dev/posts/machine-learning-developer-notes/
- Machine learning mastery : https://machinelearningmastery.com/start-here/
- Machine Learning for Beginners - A Curriculum : https://github.com/microsoft/ML-For-Beginners
- Made With ML : https://madewithml.com/
- A Visual Guide to Mamba and State Space Models : https://maartengrootendorst.substack.com/p/a-visual-guide-to-mamba-and-state
- Mistral AI Cookbook : https://github.com/mistralai/cookbook
- Natural Language Processing : https://www.coursera.org/specializations/natural-language-processing
- Neuroscience 4 ML : https://neuro4ml.github.io/ - https://github.com/neuro4ml/exercises
- Ollama : https://ollama.ai/blog
- Patterns, Predictions and Actions - A story about Machine Learning: https://mlstory.org/
- Prompt Engineering Guide: https://www.promptingguide.ai
- Python for Data Analysis : https://wesmckinney.com/book/
- Practical Deep Learning : https://course.fast.ai/Lessons/lesson1.html
- Quantization: https://huggingface.co/blog/merve/quantization
- Scientific Python Lectures : https://lectures.scientific-python.org/
- The Illustrated Transformer : https://jalammar.github.io/illustrated-transformer/
- The little book of deep learning : https://fleuret.org/public/lbdl.pdf
- The Narrated Transformer Language Model : https://www.youtube.com/watch?v=-QH8fRhqFHM - Aug 2020 - Details on architecture
- Transformers from Scratch : https://www.kaggle.com/code/auxeno/transformers-from-scratch
- The transformer model, explained clearly : https://www.deriveit.org/notes/119
- Pandas cheat sheet : https://github.com/pandas-dev/pandas/blob/main/doc/cheatsheet/Pandas_Cheat_Sheet.pdf
### Chats
- ChatGPT : https://chat.openai.com
- Claude : https://claude.ai/
- Cohere : https://coral.cohere.com/
- Groq (LPU) : https://groq.com/
- HuggingFace : https://huggingface.co/chat/
- Julius AI : https://julius.ai/
- Libertai : https://chat.libertai.io/#/
- Meta AI : https://www.meta.ai/
- Mistral : https://chat.mistral.ai/
### Discord and related GitHub
- Dalai: https://discord.gg/WWfgrzzkCT and https://github.com/cocktailpeanut/dalai - Run LLaMA and Alpaca on your computer
- FastChat (LMSys) : https://discord.gg/HSWAKCrnFx and https://github.com/lm-sys/FastChat and https://chat.lmsys.org - FastChat is an open platform for training, serving, and evaluating large language model based chatbots
- Ollama : https://discord.com/invite/ollama and https://ollama.com
- Text generation web UI: https://discord.gg/jwZCF2dPQN and https://github.com/oobabooga/text-generation-webui
- Tom Jobbins (TheBloke) : https://discord.gg/theblokeai and https://github.com/TheBloke