IMO, descriptions, lists and schedules are sufficient.
It is about estimating the amount of work.
This doesn't require any pics so far.
IMO, you need even more detailed descriptions within the lists.

For instance:
1x player model: it never fills more than a quarter of the screen, about xx polygons, only 1 skin, texture size of only 256x256, no normal map, rigged (needed for ragdoll physics), animated, 3 animations (sitting in the car, jumping when winning, whining and falling to the ground when losing). Things like that.
Even this simple but necessary asset alone will need more than 50 hours, because it has to be more than good, it has to be magnificent. It has to be worked over and over again to wow the gamers, and to fit the design of the other game assets, the needs of collision and whatever.

It is about whether the game needs 20 graphical assets (IMO the absolute minimum), 200 or 2000.
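
To put rough numbers on that, here is a minimal sketch. It assumes, purely for illustration, that every asset costs about the same 50 hours as the player model above; in a real list each asset would get its own estimate, and 50 was already the lower bound for that one. The function name and the 40-hour week are my own assumptions.

```python
# Assumption: a flat 50 hours per asset, taken from the player-model example.
# Real assets will vary a lot, so treat the result as a rough lower bound.
HOURS_PER_ASSET = 50

# Assumption: a 40-hour work week, only used to make the totals tangible.
HOURS_PER_WEEK = 40


def total_hours(asset_count, hours_per_asset=HOURS_PER_ASSET):
    """Return a rough estimate of art hours for a given number of assets."""
    return asset_count * hours_per_asset


for count in (20, 200, 2000):
    hours = total_hours(count)
    weeks = hours / HOURS_PER_WEEK
    print(f"{count:>4} assets: {hours:>6} hours, about {weeks:.0f} person-weeks")
```

With these assumptions, 20 assets already come to 1000 hours, 200 assets to 10000 and 2000 assets to 100000. That is why the list has to say which of the three it is.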

It is about calculating the needed effort to finish the game.
(Because only a finished game is a game.)