Yes. We use Modal (https://modal.com/) and are big fans. It is very ergonomic for development and lets us request GPU instances on demand. Currently, we are running our real-time model on A100s.
I see you are paying $2/h. Shoot me an email at victor ta borg.games if your model would fit on an RTX 3090 (24 GB) and you want to get that down to $0.2/h (fellow startup).