spencebeecher's comments

spencebeecher · on Feb 23, 2017

Yep! You are right. I don't think it is a huge disparity but I would like to implement the pivotal/empirical bootstrap instead. The change is just a few lines of code.
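
For anyone following along, here is a minimal sketch of how the two intervals differ. It is not the library's code: the function name `bootstrap_intervals` and its parameters are made up for illustration. The percentile interval reads quantiles straight off the bootstrap distribution, while the pivotal (empirical) interval reflects those quantiles around the point estimate.

```python
import numpy as np


def bootstrap_intervals(values, stat_func=np.mean, num_iterations=10_000,
                        alpha=0.05, seed=0):
    """Hypothetical helper: percentile and pivotal bootstrap intervals."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    n = values.size

    # Draw all resamples at once: one row of indices per bootstrap iteration.
    idx = rng.integers(0, n, size=(num_iterations, n))
    boot_stats = stat_func(values[idx], axis=1)

    point = stat_func(values)
    lo_q, hi_q = np.percentile(
        boot_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )

    # Percentile interval: quantiles of the bootstrap distribution as-is.
    percentile_ci = (lo_q, hi_q)
    # Pivotal interval: reflect the same quantiles around the point estimate.
    pivotal_ci = (2 * point - hi_q, 2 * point - lo_q)
    return point, percentile_ci, pivotal_ci
```

Both intervals have the same width. They only disagree when the bootstrap distribution is skewed around the point estimate, which is consistent with the disparity being small on large samples.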

spencebeecher · on Feb 23, 2017

Completely agree it is easy. Doing it quickly (for Python) is what this is optimized for. We would love a contribution if you have a method for resampling + percentiles that beats numpy.

nurettin · on Feb 23, 2017

I will surely give it a look if I have a need for this. Until then, thanks for the contribution and the invitation, and I have nothing against numpy.

spencebeecher · on Feb 23, 2017

Thanks for the feedback - happy hacking =)

spencebeecher · on Feb 23, 2017

John, you are a true wizard. I admire you & will work to incorporate your feedback (gathered offline) into the library =)

Thanks for the feedback!

spencebeecher · on Feb 23, 2017

I agree with you! Pandas is only used in the power analysis code (which also has matplotlib for plotting). The best thing would be to pare this down. We would gladly take contributions - I think the path forward on this feedback is clear but will take a little code =)

Most of the important stuff is just numpy, which I feel is pretty fair for most peeps.

spencebeecher · on Feb 23, 2017

Excellent q - we intend to add in other options as we go, pivotal being one. I'd also like to add in permutation tests. If you have ideas we welcome diffs =)
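
As a rough idea of what a permutation test could look like for an a/b comparison, here is a sketch under stated assumptions: a two-sided test on the difference of means, with a made-up function name `permutation_test`. Nothing here is an existing API of the library.

```python
import numpy as np


def permutation_test(test, ctrl, num_permutations=10_000, seed=0):
    """Hypothetical helper: two-sided permutation test on mean difference."""
    test = np.asarray(test, dtype=float)
    ctrl = np.asarray(ctrl, dtype=float)
    rng = np.random.default_rng(seed)

    observed = test.mean() - ctrl.mean()
    pooled = np.concatenate([test, ctrl])
    n_test = test.size

    diffs = np.empty(num_permutations)
    for i in range(num_permutations):
        # Under the null the group labels are exchangeable, so shuffle them.
        shuffled = rng.permutation(pooled)
        diffs[i] = shuffled[:n_test].mean() - shuffled[n_test:].mean()

    # Add-one correction keeps the estimated p-value strictly above zero.
    extreme = np.sum(np.abs(diffs) >= abs(observed))
    p_value = (extreme + 1) / (num_permutations + 1)
    return observed, p_value
```

The loop runs once per permutation rather than once per data point, so it stays reasonably quick. A fully vectorized version is possible but costs more memory.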

malayandi · on Feb 23, 2017

Would love to contribute. I'll try to put some work in this weekend when I have the time.

spencebeecher · on Feb 23, 2017

spencebeecher · on Feb 22, 2017

Thanks for the feedback!

numpy is used to give a speed improvement when generating the bootstrap samples - this would be very slow in a Python for loop.
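
To make the difference concrete, here is a small sketch contrasting the two approaches. Both function names are hypothetical and this is not the library's actual implementation: the first resamples with nested Python loops, the second draws one index matrix and reduces it in a single numpy call.

```python
import random

import numpy as np


def bootstrap_means_loop(values, num_iterations, seed=0):
    """Hypothetical pure-Python version: one interpreted step per draw."""
    rnd = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(num_iterations):
        total = 0.0
        for _ in range(n):
            total += values[rnd.randrange(n)]
        means.append(total / n)
    return means


def bootstrap_means_numpy(values, num_iterations, seed=0):
    """Hypothetical vectorized version: all resamples drawn in one call."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Shape (num_iterations, n): each row indexes one bootstrap resample.
    idx = rng.integers(0, values.size, size=(num_iterations, values.size))
    return values[idx].mean(axis=1)
```

The index matrix holds `num_iterations * n` integers, so for very large inputs it would probably need to be generated in chunks to keep memory in check.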

Pandas is only used in the power analysis code. I'll make that clearer.

Would love more feedback if you have it!

spencebeecher · on Feb 22, 2017

That uses the BCa method which in some situations is better.

This library gives you a/b test functionality and should be faster on large input datasets.

spencebeecher · on Feb 22, 2017

Thanks for the feedback, Petters! I agree in principle. I am familiar with that method. The use case for this is situations where you have large initial sample counts, so the correction should be less important (we do throw warnings when the initial sample counts are low). We also provide tools to check power (I'll commit an example of this later today).

Also - I gladly accept diffs if you are motivated. It is not clear to me that BCa and other variants provide substantial improvement for most practical situations. I would invite criticism here.

TL;DR - thanks for the feedback