There is no one answer, as there are a mind-boggling number of different encoding styles. But firstly, higher-frequency radio waves are not inherently more power-hungry; that's all down to the amplifier.
Now, first some terminology:
Modulation - the way in which data is encoded (has nothing to do with the medium, be it radio, cable, fiber etc.)
Symbol - term for the smallest unique piece of data being transmitted.
Carrier - the average frequency used during transmission, also serves to provide a reference phase (i.e. zero degrees phase shift) for modulation schemes that change the phase of the signal.
Now, most (but not all) of the modulation schemes used in radio don't just use zero and one, as this only gets you 1 bit of data per symbol. So instead, multiple frequencies, phases or amplitudes are used to get more spectral 'bang for your buck'. See, the bandwidth needed to transmit a signal is not just related to the data rate; it also depends on the modulation scheme and the signal-to-noise ratio.
Simple PSK (phase shift keying - it shifts between two phases, also called BPSK), a very common and simple modulation, can fit one bit per Hz of bandwidth. Straight PSK can also be detected at very low signal levels and in very noisy channels.
There is a more sophisticated version called 4-PSK (or QPSK) which can transmit 2 bits per Hz of bandwidth because it has four unique symbols, but it cannot be detected quite as easily and has slightly less noise tolerance.
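As an illustration, a 4-PSK symbol mapper can be sketched in a few lines of Python (the Gray-coded phase assignments here are just one common convention, not tied to any particular standard):

```python
import math

# Sketch of a QPSK (4-PSK) mapper: each 2-bit symbol selects one of
# four carrier phases. The assignment below is Gray-coded, so adjacent
# phases differ by only one bit (a common, but not universal, choice).
PHASE_DEG = {(0, 0): 45, (0, 1): 135, (1, 1): 225, (1, 0): 315}

def qpsk_symbol(b1: int, b2: int) -> complex:
    """Return the constellation point (I + jQ) for a 2-bit symbol."""
    theta = math.radians(PHASE_DEG[(b1, b2)])
    return complex(math.cos(theta), math.sin(theta))
```

Note that every symbol lands on the unit circle - only the phase changes, which is what makes PSK insensitive to amplitude variations compared to QAM.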
QAM is one of the most widely used radio data transmission modulations. In QAM, two streams are basically transmitted: one with the same phase as the carrier, one shifted by 90 degrees. Because these two signals are at 90° to each other (or orthogonal, or 'in quadrature'), they can be transmitted at the same frequency and separated at the receiving end without really interfering with each other. A QAM system uses changes in the amplitude of these two signals to transmit data.

Imagine a grid of 4x4 points; this represents the different amplitude combinations for the two signals: 4 unique and equally spaced amplitudes going left and right (-3, -1, +1, +3) and top to bottom (the two signals were at 90° remember?). This gives 16 possible combinations, and as 2^4 = 16, this means we can encode 4 unique bits of data for each position on the grid (or symbol).

The catch is that because we have several different amplitude levels, our signal is more sensitive to noise. There is a finite limit to how much data you can send down any medium (air, fiber, copper, string etc.); it is related to the available bandwidth and the signal-to-noise ratio and is called the Shannon-Hartley limit: $$ Capacity = available\ bandwidth \cdot log_2(1 + Signal\ power/Noise\ power) $$
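The Shannon-Hartley limit is easy to play with directly; here's a small Python sketch (the 20 MHz / 30 dB figures are purely illustrative, not taken from any particular standard):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley limit: maximum error-free bit rate for a channel
    with the given bandwidth and signal-to-noise ratio."""
    snr_linear = 10 ** (snr_db / 10)  # convert dB back to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# A hypothetical 20 MHz channel at 30 dB SNR (1000x more signal than noise):
print(f"{shannon_capacity(20e6, 30) / 1e6:.0f} Mbit/s")  # prints roughly 199 Mbit/s
```

Notice that capacity only grows logarithmically with SNR - doubling your transmit power buys you far less than doubling your bandwidth does.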
Now, 256-QAM is one of the modulations used in 4G LTE (among others) and can push 8 bits per Hz of bandwidth (log2(256) = 8), but because you now have 256 unique symbols, its ability to be received at low levels and in noisy channels is quite a bit worse than straight PSK (I think the SNR handling ability is about 30dB worse or something - 30dB being 1000x less noise power). So we can move 8 times more data in the same bandwidth, meaning we can service 8 times the number of people from the same channel, or give the same number of people 8 times the speed, provided we have a nice clean, noise-free channel.
This pales in comparison to the modulations used in ADSL internet, which can go all the way up to 32'768-QAM, or a colossal 15 bits per symbol, provided that the line is very good. (FM radio uses a special kind of QAM-like modulation where it's the frequency rather than the amplitude that's modulated: the in-phase or "I" signal carries one side of the stereo audio and the out-of-phase or "Q" signal carries the other, giving full stereo sound, but when the signal is weak, the I and Q channels can be summed together to give a meaningful but mono output - kind of a unique feature among modulation schemes.)
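All the bits-per-symbol figures quoted so far come from the same relation, bits = log2(number of unique symbols), which is quick to verify:

```python
import math

# Bits per symbol = log2(number of unique symbols in the constellation).
# These orders match the schemes mentioned above: BPSK, QPSK, 16-QAM,
# 256-QAM (LTE) and 32'768-QAM (ADSL).
for order in (2, 4, 16, 256, 32768):
    print(f"{order:>6}-point constellation: {math.log2(order):.0f} bits/symbol")
```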
The modulations used in fiber optics have much in common with QAM radio modulations (hint: the Shannon-Hartley limit still applies).
Digital data going down computer buses still tends to be just straight zeros and ones, as it's easy to encode and decode digitally; adding more complex symbols would require some analog circuitry, and it can be challenging to make good analog and digital circuits on the same chip and have them perform well at GHz speeds. Having said that, gigabit Ethernet uses a 5-level symbol to send about 2.3 bits per symbol (log2(5) ≈ 2.32) across 4 twisted pairs, so even for a 1Gbit link each cable needs only ~110MHz of bandwidth, which means they can use much cheaper cables than if each cable had to carry 250MHz of binary-encoded data (110 Mega-symbols/sec * 4 channels * 2.3 bits/symbol ~= 1Gbit/s - it's 4 channels because data travels both ways down each twisted pair simultaneously).
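The back-of-envelope arithmetic in that parenthesis can be checked directly (using the same ~110 M symbols/s and log2(5) bits/symbol figures as above):

```python
import math

symbol_rate = 110e6             # ~110 Mega-symbols/s per twisted pair (from above)
channels = 4                    # all four pairs carry data simultaneously
bits_per_symbol = math.log2(5)  # a 5-level symbol carries at most log2(5) ≈ 2.32 bits

throughput = symbol_rate * channels * bits_per_symbol
print(f"{throughput / 1e9:.2f} Gbit/s")  # comfortably over the 1 Gbit/s target
```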
Now, in SSDs, as you correctly surmised, a "1" takes more effort to write, as you have to force electrons into the tiny bit cells, whereas a "0" is the default state. However, once a "1" has been written, it is even harder to change it back to a "0" again, as you now have to somehow suck those pesky electrons back out of the insulating layer where they're stored, and there is always at least one pesky electron that doesn't want to come back out. So each time you write a "1", that "1" becomes just a bit more permanent - hence the limited number of read/write cycles of Flash memory.

There are several kinds of Flash memory: SLC, MLC, TLC and now even QLC. Each stores progressively more bits per cell at the expense of lifespan, but at a reduced cost - everything is a trade-off. SLC is commonly quoted at 100'000 R/W cycles, MLC somewhere around 10'000, TLC at 1'000 and QLC (you can see where this is going...) comes in at ~100 R/W cycles. The reason is that the number of voltage levels stored in each cell is equal to 2^(bits per cell): SLC Flash stores either a "1" or a "0", while MLC stores four different levels to represent all 2-bit combinations (00, 01, 10, 11), TLC stores 8 levels and QLC stores 16 levels. More levels means smaller gaps between them, so it takes less to upset the reading circuitry.
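The level counts and the shrinking gaps between them can be tabulated directly (treating the cell's full voltage range as evenly divided is a simplification of how real cells are read, but it shows the trend):

```python
cell_types = (("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4))

for name, bits in cell_types:
    levels = 2 ** bits  # levels per cell = 2^(bits per cell)
    # With an idealised evenly divided range, the gap between adjacent
    # levels shrinks as 1/(levels - 1) of the full range:
    gap_pct = 100 / (levels - 1)
    print(f"{name}: {bits} bit(s)/cell, {levels:>2} levels, "
          f"gap ~ {gap_pct:.1f}% of full range")
```

Going from SLC to QLC, the margin between levels collapses from the full range down to a fifteenth of it, which is why endurance and noise tolerance drop so sharply.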
So, in answer to your question, there are a whole bucketload of different ways of encoding data in modern systems. Some favour high bandwidth efficiency at the expense of complex encoding and decoding, while others favour simplicity, usually at the expense of bandwidth and transmission distance. As for ways to differentiate signals: amplitude, phase and frequency are by far the most common.