I disagree wrt Mitchell; it's too dry and out-of-date. In its place I'd recommend the Norvig book, which covers everything Mitchell does (IIRC) except 'Probably Approximately Correct' (PAC) learning. It's also more up-to-date (e.g. Support Vector Machines) and much more readable. Also, I haven't read it beyond a flip-through at a bookstore, but Christopher Bishop's book is one I'd look at (perhaps check the Amazon comments).
I've seen the Norvig book recommended by others on this site for ML, but I don't understand why: the Norvig book is an AI book, not an ML book. It only has a few chapters on learning, and IIRC much of that is on reinforcement learning. Obviously the Norvig book is very well-written and is good background for learning about ML, but I don't think it is a sufficient ML book as such.
As far as ML goes, I found the "Programming Collective Intelligence" book to be very readable and practical, but very light on the theoretical foundations (which is intentional, of course). I've got a copy of the Witten ML book ("Data Mining: Practical Machine Learning Tools and Techniques"), but to be honest I haven't gotten much from it yet, either: it doesn't seem to discuss SVMs in any detail, nor random forests or neural networks. But I haven't really dug into it yet.
This is kind of moot; truthfully, I'd now go for (or at least look first at, since they're both getting praise) Chris Bishop's or Ethem Alpaydin's new books. But between Norvig and Mitchell: Norvig has 138 pages on learning vs Mitchell's ~390, but Mitchell's is from a different era. Norvig is easier to read (larger pages, more diagrams, better writing, and you have a better sense of where you are) and has fresher material. To each his own. But like I say, I'd probably go with one of the even newer ones. For a while Mitchell was all there was, then Norvig came along, and now there are a few to choose from.
If you can read and understand Mitchell's book, you will have a very good foundation for understanding modern ML techniques. The poster was looking for references to introduce him to the field.
Programming Collective Intelligence by Toby Segaran for a practical approach in Python.