I've seen a few suspect benchmarks in recent LLM release announcements. I'm sure they attempted an honest benchmark, but until there's an independent assessment (preferably several), you have to assume there's bias in anyone's self-published benchmarks like this.
I'm guessing it's a fine-tune of some existing LLM or API, but this largely seems to be an "agent" and UI that wraps an SWE-like coding workflow, allowing more complex requests than an LLM alone could handle.
There are promising methods developing for physics-informed neural networks. Mathematical models can be integrated into the architecture of a neural network such that the parameters of those models are learned. Examples include learning the frequency of a swinging pendulum from video, among more advanced ideas.
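The pendulum example can be sketched in a few lines. This is a hedged toy, not from any particular paper: the small-angle pendulum ODE supplies the loss, and gradient descent learns the frequency directly from noisy observations (the signal, noise level, and learning rate are all invented for illustration).

```python
import numpy as np

# Toy physics-informed fit: recover a pendulum's angular frequency from
# noisy angle observations by minimizing the residual of the small-angle
# pendulum ODE:  theta'' + omega^2 * theta = 0.

rng = np.random.default_rng(0)
true_omega = 2.0
t = np.linspace(0.0, 10.0, 500)
dt = t[1] - t[0]
theta = 0.1 * np.cos(true_omega * t) + rng.normal(0.0, 1e-4, t.size)

# Approximate theta'' with central differences.
theta_dd = (theta[2:] - 2.0 * theta[1:-1] + theta[:-2]) / dt**2
theta_mid = theta[1:-1]

# Gradient descent on omega^2; the loss is the mean squared ODE residual,
# which is quadratic in omega^2, so the gradient has a closed form.
omega2 = 1.0
lr = 50.0
for _ in range(200):
    residual = theta_dd + omega2 * theta_mid
    grad = 2.0 * np.mean(residual * theta_mid)
    omega2 -= lr * grad

learned_omega = float(np.sqrt(omega2))
print(round(learned_omega, 2))  # recovers roughly true_omega = 2.0
```

A real PINN would embed the ODE residual into a network's training loss rather than fit one scalar, but the principle is the same: the physics constrains what the learnable parameters can be.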
Do what you want with those tips; they're not necessarily good advice. But if you're searching, try these search queries, replacing `%s` with your desired role(s):
Most of those ATS systems are super annoying. I actually give preference to employers using more modern systems with up-to-date user experiences, but regardless, it doesn't hurt to scrape and search all the big ATS players.
The paper linked above does directly address the case of multiple experiments running in the same context. They handle it by hill-climbing over those 180 different variations, and a Bayesian linear regression takes the place of the exploration you'd get from Thompson sampling.
You're right, the paper linked above is a different way of solving the same problem. In their case they use a model to decide which website variants to show, accounting for independent effects and pairwise dependencies. Evolution lets you optimize without needing an explicit model.
I don't think they account for potentially changing conversion rates over time or delayed conversions.
Aside from that, I'd be curious to see how these two approaches compare in a real-life situation.
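For a concrete baseline, here's what plain Thompson sampling over a few variants looks like. The conversion rates are invented, and both approaches discussed above (the paper's Bayesian regression over interactions, and the evolutionary search) are considerably richer than this sketch.

```python
import random

# Plain Thompson sampling over three page variants with Beta(1, 1)
# priors. The conversion rates are made up for illustration.
random.seed(0)
true_rates = [0.02, 0.05, 0.10]
wins = [1, 1, 1]      # Beta posterior alpha per variant
losses = [1, 1, 1]    # Beta posterior beta per variant

for _ in range(20000):
    # Draw a plausible conversion rate from each variant's posterior and
    # show the variant whose draw is highest; this balances exploration
    # and exploitation automatically.
    draws = [random.betavariate(wins[i], losses[i]) for i in range(3)]
    chosen = draws.index(max(draws))
    if random.random() < true_rates[chosen]:
        wins[chosen] += 1
    else:
        losses[chosen] += 1

pulls = [wins[i] + losses[i] - 2 for i in range(3)]
print(pulls)  # traffic should concentrate on the best variant
```

It also makes the earlier caveat concrete: nothing in this loop handles conversion rates that drift over time or conversions that arrive late.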
They have some hard comparisons in some of their earlier blog posts on how spaCy compares to other popular open-source NLP libraries. In my experience it has been much easier to use and faster than things like Stanford's library or NLTK. In general it's aimed at production or commercial use, whereas the other libraries I typically hear mentioned target a more academic audience.
Have you used paredit-mode in emacs with a lisp dialect? Getting proficient with this mode can be a lot like what you describe. Paredit lets you navigate and edit the tree structure of lisp code pretty effectively. It's not inherently a modal paradigm, but I used it with evil-mode successfully. I'd imagine what you describe could be a refinement on that technique. Lisp lends itself well to this type of editing due to its lack of syntax. Other languages are more difficult.
Artemis Health | Product/Data/Engineering Roles | Salt Lake City, UT | ONSITE, Full Time | https://artemishealth.com
We build analytics and visualization tools for self-insured companies. We're funded, have had great growth, and have several open positions, including:
- Data Pipeline Manager
- Data Quality Analyst
- Data Pipeline Engineer
- ETL Engineer
- Software Engineer in Testing
- Frontend Engineer
Our frontend and API stack is built with Angular, Django, and Django Rest Framework, and some Rust. We're using MySQL and Redshift for the operational and analytics databases respectively.
With our backend and pipeline we use some traditional ETL tooling (Pentaho, Kettle) and have started building out the more complex aspects of our pipeline with the JVM and Kotlin, in addition to various Python scripts. Again using MySQL and Redshift databases.