Submissions from github.com/t8

		Add 500M tokens of context space to any LLM with <300ms latency (github.com/t8)
		3 points by tatef 6 days ago | 1 comment
		Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon (github.com/t8)
		221 points by tatef 12 days ago | 86 comments