Those aren't really the CFLAGS, but rather the configuration options passed to the compiler's configure script. The multilib list covers the three most common options GCC has for supporting the SH4: always single-precision (-m4-single-only), no floating-point support (-m4-nofpu), and full FPU support (-m4).
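If you ever want to check which of those three variants a particular file is actually being built for, something like the sketch below should do it. It assumes GCC's SH backend predefines __SH4_SINGLE_ONLY__, __SH4_NOFPU__ and __SH4__ for the respective options (that's my understanding, but check the output of sh-elf-gcc -dM -E with your own toolchain). sh4_fpu_variant is just a name I made up for the example.

```c
/* Hypothetical helper: reports which SH4 FPU variant this file was
   compiled for, based on macros GCC is assumed to predefine. */
static const char *sh4_fpu_variant(void)
{
#if defined(__SH4_SINGLE_ONLY__)
    return "-m4-single-only";
#elif defined(__SH4_NOFPU__)
    return "-m4-nofpu";
#elif defined(__SH4__)
    /* Assumed to mean -m4; other SH4 variants may define this too. */
    return "-m4";
#else
    return "not an SH4 build (or the macros are not defined)";
#endif
}
```

On a normal desktop compiler none of the macros exist, so you just get the fallback string.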
KallistiOS assumes throughout its code that you will leave the FPU in single-precision mode, which is why the default is -m4-single-only. Some of the more useful operations that the FPU can do (like matrix * vector multiplication and vector dot products) are only available as single-precision operations. Basically, if there's no reason you need double-precision arithmetic (and usually there isn't, unless you're doing some sort of scientific calculation), then this is a sane state to stay in. As for the warning on printf("%f", 0.0f), that's because printf("%f", ...) expects the argument to be a double; cast your argument to double and you won't get that warning. That said, the warning is awfully silly, since the C standard says exactly what should happen in that case (a float passed to a variadic function like printf is promoted to double automatically). Yes, it does only show up with -m4-single-only, but I blame that on GCC doing something silly.
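For the printf warning, the cast looks like this. It's only a minimal sketch: format_float is a made-up helper name, and I'm using snprintf instead of printf purely so the result is easy to inspect.

```c
#include <stdio.h>

/* Hypothetical helper: formats a float with %f into buf. */
static int format_float(char *buf, size_t len, float value)
{
    /* %f is specified to take a double, so make the conversion explicit
       rather than relying on the implicit promotion. */
    return snprintf(buf, len, "%f", (double)value);
}
```

Calling format_float(buf, sizeof buf, 0.0f) leaves "0.000000" in buf, and the same (double) cast works directly in a printf call.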
-m4-nofpu is passed to the configure script for the sake of programs that have used that mode in the past (namely Slinkie, Dan Potter's dcload replacement). There's generally no particularly good reason to use this mode, unless you really don't want the extra baggage of the floating-point operations hanging around (i.e., you're trying to optimize something that doesn't use floating point at all for size). You could safely leave this one out and not have to worry about anything.
-m4 gives you full double-precision support, but as I mentioned before, KOS assumes you're always in single-precision mode, so using this might not be the best idea with KOS programs. The SH4's FPU has two modes, selected by the PR bit in the FPSCR register, which determine whether it does single-precision or double-precision math. KOS always sets that register at boot to do single-precision and assumes it will never be touched after that. Certain things that you're likely to do in KOS (like the matrix/vector stuff mentioned earlier) don't work at all in double-precision mode, and pretty much all floating-point math takes longer in double-precision mode. Simply put, there's not much of a reason to support double precision in KOS, so this mode usually isn't used in KOS programs. If I'm not mistaken, dcload compiles with -m4, which is why we still include it.
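To make the matrix/vector point a bit more concrete, here is the arithmetic behind a 4-element dot product written as plain, portable C. vec4f and vec4f_dot are names I invented for the illustration; this is not the KOS API and it doesn't touch the hardware instructions directly. It's only meant to show the kind of all-float math that single-precision mode is there for.

```c
/* Hypothetical 4-component vector type, for illustration only. */
typedef struct {
    float x, y, z, w;
} vec4f;

/* Single-precision dot product of two 4-component vectors.
   Every operand and the result are float; nothing here needs double. */
static float vec4f_dot(vec4f a, vec4f b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```

A matrix * vector multiply is just four of these, one per matrix row, which is why keeping everything in float costs you nothing for this sort of work.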
TL;DR: -m4 and -m4-nofpu are included for the benefit of programs that used them in the past (and still do), but you should generally stick with -m4-single-only.