An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

- Vision Transformer (ViT) Method

Applies the Transformer, a model originally used in natural language processing, to vision.
Image patches -> Transformer Encoder
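
The "image patches" step can be sketched in a few lines. This is only a minimal illustration, not the paper's implementation: the function name `patchify`, the nested-list image layout (height x width x channels), and the 224 x 224 input size are assumptions made for the example. It covers only the splitting and flattening of patches; in ViT the flattened patches are then linearly projected and combined with position embeddings before entering the Transformer encoder, which is not shown here.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Hypothetical helper for illustration only.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    flat.extend(image[row][col])  # append this pixel's channel values
            patches.append(flat)
    return patches


# A dummy 224 x 224 RGB image filled with zeros (assumed input size).
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
tokens = patchify(image, patch_size=16)
print(len(tokens), len(tokens[0]))  # 196 768
```

With 16 x 16 patches, the image becomes a sequence of 14 x 14 = 196 "words", each a vector of 16 x 16 x 3 = 768 values, which is the sense in which an image is worth 16x16 words.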

Hong Yong Man