What a great coincidence. I found gpt-2-simple on Friday and just got it running in a Flask app on Fargate a few minutes ago. gpt-2-simple made the process so simple that my biggest problems were infra and not inference.
Have you heard of anyone successfully running it in a Lambda?
GPT-2 small might be too big/slow for a Lambda (admittedly, I am less familiar with the AWS stack and more familiar with GCP). In the meantime, I do have it running on Cloud Run (https://github.com/minimaxir/gpt-2-cloud-run) with decent success.