Hacker News | mihaich's comments

Location: Bucharest / Romania

Remote: Yes (Worldwide)

Willing to relocate: No

Technologies: C# and .NET, SQL Server, JavaScript, Python, LLM & GenAI

Résumé/CV: https://drive.google.com/file/d/1Vjd6qrLAVpRuxtlzqhai-ke1DHI...

Email: please see the resume

LinkedIn: https://www.linkedin.com/in/mihai-chirculescu-a65223b7

GitHub: https://github.com/Mihaiii


I made a simple framework for LLM sampling algorithms that can discard generated tokens.

This means you can set rules under which the most recently generated tokens are considered incorrect, get discarded, and are regenerated.
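To make the idea concrete, here is a minimal sketch of a sampling loop that can discard tokens. It is not the framework's actual API: the function names, the toy "model" table, and the example rule are all hypothetical, and a real setup would rank candidates from model logits instead of a lookup table.

```python
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for a model: candidates for the next token,
# ordered from most to least preferred, keyed by the previous token.
TOY_MODEL: Dict[object, List[str]] = {
    None: ["the"],
    "the": ["the", "cat", "dog"],
    "cat": ["cat", "sat"],
    "sat": ["sat", "down"],
    "dog": ["ran"],
    "down": ["."],
    "ran": ["."],
    ".": ["."],
}


def toy_ranked_candidates(prefix: List[str]) -> List[str]:
    last = prefix[-1] if prefix else None
    return TOY_MODEL[last]


def no_immediate_repeat(tokens: List[str]) -> int:
    """Example rule: return how many trailing tokens to discard (0 = keep all)."""
    if len(tokens) >= 2 and tokens[-1] == tokens[-2]:
        return 1
    return 0


def generate_with_backtracking(
    ranked_candidates: Callable[[List[str]], List[str]],
    count_to_discard: Callable[[List[str]], int],
    length: int,
    max_steps: int = 1000,
) -> List[str]:
    tokens: List[str] = []
    # position -> tokens already rejected at that position for the current prefix
    banned: Dict[int, Set[str]] = {}
    for _ in range(max_steps):
        if len(tokens) >= length:
            break
        pos = len(tokens)
        options = [t for t in ranked_candidates(tokens) if t not in banned.get(pos, set())]
        if not options:
            if not tokens:
                break  # nothing left to try anywhere
            # Dead end: give up on this position and reject the previous token too.
            banned.pop(pos, None)
            rejected = tokens.pop()
            banned.setdefault(len(tokens), set()).add(rejected)
            continue
        tokens.append(options[0])
        n = min(count_to_discard(tokens), len(tokens))
        if n > 0:
            start = len(tokens) - n
            first_discarded = tokens[start]
            del tokens[start:]
            # Bans recorded for later positions belonged to the old prefix.
            for p in [p for p in banned if p > start]:
                del banned[p]
            banned.setdefault(start, set()).add(first_discarded)
    return tokens


result = generate_with_backtracking(toy_ranked_candidates, no_immediate_repeat, 5)
```

The rule only reports how many trailing tokens to throw away. The loop handles the rest: it removes them and remembers what was rejected at that position so the same choice is not made again.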

I have included 2 demo algorithms.

It supports both GGUF models (llama.cpp) and models in Hugging Face format (Transformers library).

Enjoy!


This makes sense, thank you very much for the feedback!

Another demo idea I had is to have an input field where the user can enter a GitHub username, retrieve all the starred repositories, and enable semantic search on the titles and descriptions of those repositories.

The main idea is that users will typically enter their own usernames, and therefore, they are familiar with the repositories they have starred, which provides a better search experience when testing the component.

Let me know if something else would work better for you.


Not the OP, but personally I'd prefer a search on some realistic examples, like maybe the React docs (https://react.dev/learn) or Next docs (https://nextjs.org/docs) that I frequently struggle to search, especially for complex questions like "what kind of caching does Next.js do" or "what is the proper way to do an async clientside fetch".

--------

Some other feedback (just so I don't make a bunch of separate posts):

- Demo #2 404s

- It would be nice to have some way to highlight or summarize the relevant parts of search results, especially when they're "semantically" searched and each result is several dozen words. There's no easy way to skim the results, and it's not really clear to me (as a user) why the rankings are the way they are. It just looks like a bunch of reordered paragraphs that I still have to read all of.

- 20 MB is a LOT to ask a client to download just to run a search bar. Is there any way to run this as a serverside function / serverless?


Thanks for the feedback! I'll find time to make it better and retry the submission. Now I know it's actually possible to get on the first page :)

Regarding your last point: yes, 20MB is a lot, but the whole point is to have everything on the client side, within a single component you install. You can already get the functionality you mention with MUI's standard Autocomplete. That being said, I'll look into ways to use smaller models.


I wasn't expecting it, but I actually got some votes <3, so here is a better description:

This is a React component for searching/sorting by meaning (not by "characters included in a string", like standard search).

It uses a small ML model that runs on the client side (inside the component!). When I say small, I mean ~20MB. The model is downloaded only once (on first use) and afterwards loaded from the browser's cache.

You can use this component to search and filter a dropdown list or an external list (like the paragraphs of a webpage) by meaning. Queries can be full sentences matched against full sentences, not just short words or substrings.
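For anyone wondering what "by meaning" looks like mechanically, here is a rough sketch of ranking options by cosine similarity between embedding vectors. It is in Python for brevity and is not the component's code. The `toy_embed` function and its concept table are hypothetical stand-ins; a real setup would get the vectors from an embedding model.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical word -> concept vector table standing in for a real embedding model.
TOY_CONCEPTS = {
    "car": (1.0, 0.0, 0.0), "automobile": (1.0, 0.0, 0.0), "drive": (1.0, 0.0, 0.0),
    "dog": (0.0, 1.0, 0.0), "puppy": (0.0, 1.0, 0.0), "pet": (0.0, 1.0, 0.0),
    "pizza": (0.0, 0.0, 1.0), "food": (0.0, 0.0, 1.0), "eat": (0.0, 0.0, 1.0),
}


def toy_embed(text: str) -> List[float]:
    """Sum the concept vectors of known words; unknown words contribute nothing."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        for i, value in enumerate(TOY_CONCEPTS.get(word, (0.0, 0.0, 0.0))):
            vec[i] += value
    return vec


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat text with no known words as unrelated
    return dot / (norm_a * norm_b)


def rank_by_meaning(
    query: str,
    options: List[str],
    embed: Callable[[str], Sequence[float]],
) -> List[Tuple[float, str]]:
    """Return (score, option) pairs, most similar to the query first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(option)), option) for option in options]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


OPTIONS = ["I drive a car", "my dog is a pet", "pizza for dinner"]
ranked = rank_by_meaning("automobile", OPTIONS, toy_embed)
```

A substring search for "automobile" would match none of these options, while the similarity ranking puts "I drive a car" first because both map to the same concept direction in the toy table.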

Here is a demo: https://mihaiii.github.io/semantic-autocomplete/

I believe this is super useful in the real world! :)

Let me know if you have any questions or feedback!

Thank you!


Very cool idea! Congrats!


Thank you!!


That's an interesting observation and it applies to me too. I just asked ChatGPT to summarize it.


This is an unbelievably good video. I just created an account here to say "thank you for sharing it!".

