Quite a lot of research has been done on human vision, and most people can certainly tell the difference between 30 and 60 FPS. It's difficult to nail down an 'exact' number because our vision doesn't map cleanly onto discrete frame/time values: there's persistence of vision to deal with, an awful lot of signal processing done by the brain to fill in the gaps, and the eye's tendency to need faster stimulus to reach its critical flicker fusion frequency in brighter lighting conditions.
Here's a simple 60/30/15 Hz test you can look at:
https://www.testufo.com/
Follow the UFO and you'll probably notice how 'blurry' the 30 Hz sample is compared to the 60 Hz sample. Your brain is receiving half the motion information, after all.
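You can put rough numbers on why the lower rate looks worse. A minimal sketch, assuming an on-screen object speed of 960 px/s (roughly TestUFO's default, but treat it as an assumption): the per-frame jump doubles every time the refresh rate halves.

```python
# Back-of-the-envelope: how far a moving object jumps between frames
# at different refresh rates. The 960 px/s speed is an assumption,
# not a spec of the TestUFO site.
speed_px_per_s = 960  # assumed on-screen object speed

for hz in (15, 30, 60, 120):
    step = speed_px_per_s / hz  # pixels the object moves per frame
    print(f"{hz:>3} Hz: {step:.0f} px/frame")
```

At 30 Hz the object skips 32 px between frames versus 16 px at 60 Hz, so your visual system has bigger gaps to smear over, which reads as blur or judder.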
Many people can pretty easily tell the difference between 60 Hz and 120 Hz displays, but it gets fuzzy somewhere in the 144-240 Hz range. Contemporary VR headsets, for instance, use 80, 90, or 120 Hz displays at a minimum. Anything below that and you generally end up with awful visual artifacting and nausea, which is further evidence that the human visual system ingests information above 60 Hz. If you want a life-like EVF you probably need around 120 Hz before serious diminishing returns set in.
For image capture, well, the higher the frame rate and resolution the better, right? If you have a 1,000 FPS x 100 MP camera system, the real problems are the signal-to-noise of all those frames for the photographer to sift through after the fact, and the storage capacity for the images in the first place. Those are more logistical issues than biological ones.
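The storage problem is easy to illustrate with rough arithmetic. A minimal sketch, assuming 12-bit uncompressed raw (1.5 bytes/pixel); both the bit depth and the lack of compression are assumptions, not specs of any real camera:

```python
# Rough data-rate math for a hypothetical 1,000 FPS x 100 MP capture rig.
# Assumes 12-bit packed raw samples (1.5 bytes/pixel), no compression.
megapixels = 100
fps = 1_000
bytes_per_pixel = 1.5  # 12-bit raw, packed

bytes_per_frame = megapixels * 1e6 * bytes_per_pixel
rate_gb_per_s = bytes_per_frame * fps / 1e9
print(f"{rate_gb_per_s:.0f} GB/s sustained")  # 150 GB/s
```

At 150 GB/s a 2 TB drive fills in about 13 seconds, which is why this ends up being a logistics problem long before it's a biology problem.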