I think it can be a useful tool for automation of very standardized ML tasks. However:
It's a command line tool that is also intended for non-technical folks. I sense a contradiction.
That doesn't even speak to the requirement of understanding all these ML algorithms so I can specify them in the config file, or understanding the YAML format, or data curation. At this point it would be easier to write the Python code, especially with scikit-learn, which is a very well-documented library.
Hi, I want to clear up some points. First, it is not intended for non-technical folks; this was never claimed! However, even if it were, we are currently working on a GUI, which (non-technical) users can launch by typing a single command in the terminal.
Second, I'm a technical user. In fact, this is my daily work, and we built this tool for the reasons given in the docs/README, so you can check them out.
Third, you mentioned understanding the YAML format. Really? I mean, YAML is about the most readable format there is. I can't imagine anyone needing more than 30 minutes to learn it.
Finally, yes, sklearn is great and well documented, but did you check how many libraries out there are basically wrappers that make it easier/more abstract to write sklearn code? You'll be surprised.
As discussed in the official repo & docs, gathering your preprocessing & model definition parameters/configs in one human-readable file, where you can manipulate them easily, is a much cleaner approach than writing code. You can re-run experiments, generate drafts, and build proofs of concept as fast as possible. At the end of the day, we all have different opinions; you can still write code of course. The tools are there to help.
The README says "The goal of the project is to provide machine learning for everyone, both technical and non technical users"; that definitely sounds as though it's intended for non-technical users.
Yes, so? Your claim that it's not intended for non-technical users still contradicts what the README says. Sure, the README implies that it's not _exclusively_ intended for non-technical users, but it does imply that the tool is intended for them.
I am only going off the README, as the other user pointed out, which addresses both technical and non-technical people.
So yes, this tool can have great utility. It adds an abstraction layer and removes busywork for repetitive programming tasks. However, the utility will be for users acquainted with the command line: users who know what a config file is, and who understand the data types, lists, and key-value relationships assumed by the YAML spec. Users will also have to know the different algorithms so they can populate the config. All of these things require technical knowledge.
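To make that concrete, here is a rough sketch of what a parsed config looks like once it is loaded, and the kind of checks a user has to be able to reason about when something is wrong. Everything in it is an assumption for illustration: the key names, the `KNOWN_ALGORITHMS` table, and the `validate` helper are hypothetical and are not the tool's actual schema or API.

```python
# Hypothetical config, written as the nested structure a YAML parser would return.
# The key names are illustrative only, not the tool's real schema.
config = {
    "dataset": {"path": "data.csv", "target": ["label"]},  # nested mapping + list
    "model": {
        "type": "classification",                 # string value
        "algorithm": "RandomForest",              # must be a name the tool knows
        "arguments": {"n_estimators": 100},       # integer hyperparameter
    },
}

# Assumed lookup table of algorithm names per task type (made up for this sketch).
KNOWN_ALGORITHMS = {
    "classification": {"RandomForest", "LogisticRegression"},
    "regression": {"LinearRegression"},
}


def validate(cfg):
    """Return a list of problems found in a parsed config."""
    problems = []
    # Every top-level section must exist and be a mapping.
    for section in ("dataset", "model"):
        if section not in cfg:
            problems.append(f"missing section: {section}")
        elif not isinstance(cfg[section], dict):
            problems.append(f"section {section} must be a mapping")
    # The algorithm has to match the declared task type.
    model = cfg.get("model")
    if isinstance(model, dict):
        task = model.get("type")
        algorithm = model.get("algorithm")
        if task not in KNOWN_ALGORITHMS:
            problems.append(f"unknown task type: {task!r}")
        elif algorithm not in KNOWN_ALGORITHMS[task]:
            problems.append(f"unknown algorithm for {task}: {algorithm!r}")
    return problems


print(validate(config))
```

Fixing any of the reported problems means knowing what a mapping is, why a list differs from a string, and which algorithms fit which task, which is exactly the technical knowledge in question.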
All of the above are things we technical users take for granted, so a claim to cater to non-technical users must be evaluated from their perspective.
I am not belittling your work. This is a good project, but it currently targets too broad an audience.