Hacker News | spencebeecher's comments

Yep! You are right. I don't think it is a huge disparity but I would like to implement the pivotal/empirical bootstrap instead. The change is just a few lines of code.
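
For anyone following along, here is a rough sketch of how the two intervals differ. This is not the library's code; `bootstrap_ci` and its parameters are names made up for illustration. The percentile interval reads the quantiles straight off the bootstrap distribution, while the pivotal (empirical) interval reflects them around the observed statistic.

```python
import numpy as np

def bootstrap_ci(data, stat=np.mean, n_boot=10_000, alpha=0.05, seed=0):
    """Return (percentile_ci, pivotal_ci) for `stat`. Hypothetical helper."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    theta_hat = stat(data)  # statistic on the original sample

    # Draw all resamples at once: one row of indices per bootstrap replicate.
    idx = rng.integers(0, len(data), size=(n_boot, len(data)))
    boot = np.apply_along_axis(stat, 1, data[idx])

    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])

    percentile_ci = (lo, hi)
    # Pivotal interval: reflect the bootstrap quantiles around theta_hat.
    pivotal_ci = (2 * theta_hat - hi, 2 * theta_hat - lo)
    return percentile_ci, pivotal_ci
```

Both intervals have the same width. They only differ when the bootstrap distribution is skewed, which is why the gap is usually small for large samples.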


Completely agree it is easy. Doing it quickly (for Python) is what this is optimized for. We would love a contribution if you have a method for resampling + percentiles that beats numpy.


I will surely give it a look if I have a need for this. Until then, thanks for the contribution and the invitation, and I have nothing against numpy.


Thanks for the feedback - happy hacking =)


John, you are a true wizard. I admire you & will work to incorporate your feedback (gathered offline) into the library =)

Thanks for the feedback!


I agree with you! Pandas is only used in the power analysis code (which also uses matplotlib for plotting). The best thing would be to pare this down. We would gladly take contributions - I think the path forward on this feedback is clear but will take a little code =)

Most of the important stuff is just numpy, which I feel is pretty fair for most peeps.


Excellent q - we intend to add in other options as we go, pivotal being one. I'd also like to add in permutation tests. If you have ideas we welcome diffs =)
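
To make the permutation test idea concrete, here is a minimal sketch of a two-sided, two-sample test on the difference in means. It is only an illustration of the general technique, not something that exists in the library; `permutation_test` is a hypothetical name, and it assumes NumPy 1.20 or later for `Generator.permuted`.

```python
import numpy as np

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on mean(a) - mean(b). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()

    # Under the null, group labels are exchangeable: pool and reshuffle.
    pooled = np.concatenate([a, b])
    # permuted(..., axis=1) shuffles every row independently.
    shuffled = rng.permuted(np.tile(pooled, (n_perm, 1)), axis=1)

    diffs = shuffled[:, :len(a)].mean(axis=1) - shuffled[:, len(a):].mean(axis=1)

    # The +1 terms count the observed labelling so the p-value is never zero.
    p_value = (np.sum(np.abs(diffs) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value
```

The tiled array is n_perm rows by len(a) + len(b) columns, so for very large inputs it would need to be processed in chunks.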


Would love to contribute. I'll try to put some work in this weekend when I have the time.


<3


Thanks for the feedback!

numpy is used to give a speed improvement when generating the bootstrap samples - this would be very slow in a Python for loop.
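
Roughly, the difference looks like this. Both functions below are illustrative names, not the library's API; the first resamples one element at a time in pure Python, the second draws every index in a single numpy call.

```python
import numpy as np

def bootstrap_means_loop(data, n_boot, rng):
    """Pure-Python resampling: one random draw per element (slow)."""
    n = len(data)
    out = []
    for _ in range(n_boot):
        sample = [data[rng.integers(0, n)] for _ in range(n)]
        out.append(sum(sample) / n)
    return np.array(out)

def bootstrap_means_vectorized(data, n_boot, rng):
    """Draw an (n_boot, n) index matrix at once and reduce along rows."""
    data = np.asarray(data)
    idx = rng.integers(0, len(data), size=(n_boot, len(data)))
    return data[idx].mean(axis=1)
```

The vectorized version trades memory for speed, since it materializes n_boot * len(data) values at once.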

Pandas is only used in the power analysis code. I'll make that clearer.

Would love more feedback if you have it!


That uses the BCa method, which is better in some situations.

This library gives you a/b test functionality and should be faster on large input datasets.


Thanks for the feedback Petters! I agree in principle. I am familiar with that method. The use case for this is situations where you have large initial sample counts, so the correction should matter less (we do throw warnings when the initial sample counts are low). We also provide tools to check power (I'll commit an example of this later today).

Also - I gladly accept diffs if you are motivated. It is not clear to me that BCa and other variants provide substantial improvement for most practical situations. I would invite criticism here.

TL;DR - thanks for the feedback

