여정민's Blog

Generative AI/Data

  • The Data Collection Pipeline, Explained Through the LLM Twin Project 2025.02.06
  • A Summary of Data Synthesis Methods for Fine-Tuning 2025.02.04
  • NVIDIA: Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator 2025.01.26
  • NVIDIA: Synthetic Data Generation 2025.01.25
  • What Makes Good Data For Alignment? A Comprehensive Study of Automatic Data Selection In Instruction Tuning 2025.01.25
  • Alpagasus: Training A Better Alpaca with Fewer Data 2025.01.23
  • Code Less, Align More: Efficient LLM Fine-tuning for Code Generation with Data Pruning 2025.01.21
  • ShareGPT4V: Improving Large Multi-Modal Models with Better Captions 2025.01.20
  • Enhancing Chat Language Models by Scaling High-quality Instructional Conversations 2025.01.20
  • GENIE: Achieving Human Parity In Content-Grounded Datasets Generation 2025.01.19