| | Optimizing LLMs from a Dataset Perspective (sebastianraschka.com) |
| 138 points by alexmolas on Sept 15, 2023 | past | 24 comments |
|
| | PyTorch: Cross-Entropy vs. Negative Log Likelihood (sebastianraschka.com) |
| 2 points by auraham on Sept 12, 2023 | past |
|
| | Training and aligning LLMs with RLHF and RLHF alternatives (sebastianraschka.com) |
| 102 points by rasbt on Sept 10, 2023 | past | 14 comments |
|
| | Understanding Llama 2 and the New Code Llama LLMs (sebastianraschka.com) |
| 170 points by rasbt on Aug 30, 2023 | past | 34 comments |
|
| | Llama 2, CodeLlama, and GPT-4 performance: recent LLM developments and research (sebastianraschka.com) |
| 1 point by rasbt on Aug 27, 2023 | past |
|
| | AI Research Highlights in 3 Sentences or Less (July-August 2023) (sebastianraschka.com) |
| 1 point by rasbt on Aug 12, 2023 | past |
|
| | Does it beat LLMs? NN+Gzip method reimplemented and explained step-by-step (sebastianraschka.com) |
| 3 points by rasbt on July 30, 2023 | past |
|
| | State of Computer Vision 2023 (sebastianraschka.com) |
| 1 point by eugenOrl on July 24, 2023 | past |
|
| | AI and DL paper highlights June-July 2023 (sebastianraschka.com) |
| 1 point by rasbt on July 15, 2023 | past |
|
| | State of Computer Vision 2023 (sebastianraschka.com) |
| 2 points by rasbt on July 6, 2023 | past |
|
| | Accelerating PyTorch Model Training 10x (With Mixed-Precision and FSDP) (sebastianraschka.com) |
| 2 points by rasbt on June 26, 2023 | past |
|
| | Understanding Encoder and Decoder LLMs (sebastianraschka.com) |
| 5 points by rasbt on June 17, 2023 | past |
|
| | AI Research Highlights in 3 Sentences or Less (May-June 2023) (sebastianraschka.com) |
| 2 points by rasbt on June 10, 2023 | past |
|
| | Recapping recent LLM research concerning tuning strategies and data efficiency (sebastianraschka.com) |
| 2 points by rasbt on June 3, 2023 | past |
|
| | Finetuning LLMs Efficiently with Adapters (sebastianraschka.com) |
| 1 point by Anon84 on May 27, 2023 | past |
|
| | Why the original transformer figure is wrong, and some other tidbits about LLMs (sebastianraschka.com) |
| 237 points by rasbt on May 24, 2023 | past | 49 comments |
|
| | AI Research Highlights in 3 Sentences or Less (April-May 2023) (sebastianraschka.com) |
| 1 point by rasbt on May 15, 2023 | past |
|
| | Accelerating Large Language Models with Mixed-Precision Techniques (sebastianraschka.com) |
| 1 point by tim_sw on May 12, 2023 | past |
|
| | Parameter-Efficient LLM Finetuning with Low-Rank Adaptation (LoRA) (sebastianraschka.com) |
| 2 points by tim_sw on April 27, 2023 | past |
|
| | Finetuning Large Language Models (sebastianraschka.com) |
| 223 points by headalgorithm on April 22, 2023 | past | 70 comments |
|
| | Understanding large language models: A cross-section of the relevant literature (sebastianraschka.com) |
| 307 points by headalgorithm on April 16, 2023 | past | 31 comments |
|
| | Understanding Parameter-Efficient Finetuning of Large Language Models (sebastianraschka.com) |
| 2 points by rasbt on April 13, 2023 | past |
|
| | Large Language Models 3.0 (sebastianraschka.com) |
| 3 points by headalgorithm on April 5, 2023 | past |
|
| | Keeping Up with AI Research and News (sebastianraschka.com) |
| 1 point by rasbt on March 23, 2023 | past |
|
| | Latest research on reinforcement learning w human feedback for language models (sebastianraschka.com) |
| 1 point by rasbt on March 7, 2023 | past |
|
| | Show HN: Some Techniques to Make Your PyTorch Models Train Faster (sebastianraschka.com) |
| 1 point by rasbt on Feb 23, 2023 | past |
|
| | Understanding Large Language Models – A Transformative Reading List (sebastianraschka.com) |
| 81 points by mariuz on Feb 11, 2023 | past | 16 comments |
|
| | Understanding Large Language Models – A Transformative Reading List (sebastianraschka.com) |
| 2 points by mellosouls on Feb 11, 2023 | past |
|
| | Understanding and coding the self-attention mechanism of large language models (sebastianraschka.com) |
| 158 points by mariuz on Feb 10, 2023 | past | 37 comments |
|
| | Understanding the Self-Attention Mechanism of Large Language Models from Scratch (sebastianraschka.com) |
| 2 points by rasbt on Feb 9, 2023 | past |
|