Alkemet News
Real-time LLM Inference on Standard GPUs: 3k tokens/s per request
(blog.kog.ai)
107 points by NicoConstant 5 hours ago | 54 comments