Hacker News | rustdeveloper's comments

Do we know how LLMs available in OpenLLM and other open source LLMs compare to different versions of GPT models? I know there’s a leaderboard on huggingface: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb... but it doesn’t contain GPT models.



The GPT-4/ChatGPT entries are visible in the source: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...


Does it give access to the internet? Could it be used with this setup: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap... ?


> Does it give access to the internet?

Yes, but via the IP address of whatever machine you're running it on.

> Could it be used with this setup: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap... ?

Not usefully - that tool requires access to an IP address pool of a mobile network, which this project wouldn't give you.


For $20 you can get over 10k requests from scraping fish: https://scrapingfish.com/buy


I don’t see how any LLM would help me with a high-quality proxy, which is what I actually need for web scraping. I’m using https://scrapingfish.com/ for this.


There are tutorials for building your own mobile proxy pool, so it’s very accessible these days: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap...


I'll be working on my own mobile proxy build: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap...


I initially built a system for web scraping but constantly ran into issues with getting blocked, even when using good-quality residential proxies. I had to keep investigating why I was getting blocked and updating my tools. Sometimes the effort was significant, for example when I had to switch to a different framework that gave me a better success rate.

Then I switched to a web scraping API (I'm using https://scrapingfish.com as they have convenient pricing for my use case, but there are other alternatives). Now I only have to maintain the parsing logic in my scrapers. It also reduced my scraping costs, since I no longer pay for proxies, which are more expensive at my scale than a web scraping API.
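
A rough sketch of what that split can look like, in case it helps. The endpoint and the `api_key`/`url` parameter names below are made up for illustration and are not any particular provider's API, so check your provider's docs. The point is only that the fetch side shrinks to building one URL, and the part you keep maintaining is the parser.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names (assumptions, not a real provider's API).
API_ENDPOINT = "https://api.example-scraper.invalid/v1/"


def build_request_url(target_url: str, api_key: str) -> str:
    """Wrap the page we want to scrape in a call to the assumed scraping API."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


class TitleParser(HTMLParser):
    """Parsing logic: the only piece the scraper author still maintains."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Collect text only while inside the <title> element.
        if self._in_title:
            self.title += data


def parse_title(html: str) -> str:
    """Return the stripped text of the page's <title>, or "" if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

You would fetch `build_request_url(...)` with whatever HTTP client you already use and pass the response body to `parse_title` (or your own parser). Proxies, retries and unblocking stay on the API side.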


This looks really cool! There was a tutorial posted on HN about building a mobile proxy pool with an RPI, which had obvious limitations: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap... It seems this could be a way to scale the capabilities of a single RPI.


For web scraping, I recommend using a web scraping API, e.g. https://scrapingfish.com. This solves all potential problems with getting blocked and can make data extraction easier as well.

For the app, I've recently started using Remix (https://remix.run) and so far it has been a good choice for me. Mantine has a good Remix integration for the front end: https://mantine.dev/guides/remix/. I think it's a good full-stack choice if you just want to quickly build an app for your project/product.


I use Scraping Fish API: https://scrapingfish.com/

