To use a simple example: say you have two digital cameras, identical in every way except that one is 3 megapixel and the other is 6 megapixel.
If you take the same photo with both cameras, you will find that the 3 megapixel camera produces an image at a lower resolution: it is fewer pixels wide and fewer pixels high.
You can stretch the 3 megapixel image to the same resolution as the 6 megapixel image, but in doing so the stretching routine has to insert pixels to increase the width and height. It does this by guessing what colour each inserted pixel should be. There are a number of clever ways to make these guesses (interpolation methods such as nearest-neighbour, bilinear, and bicubic), and each produces slightly different results.
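As a rough illustration of what a stretching routine does, here is a minimal Python sketch using Pillow. The filename, the 2048x1536 source size (roughly 3 megapixels), and the 2816x2112 target size (roughly 6 megapixels) are all assumptions for the example, not values from anywhere in particular.

```python
# A minimal sketch, assuming Pillow is installed and "photo_3mp.jpg" is a
# hypothetical 2048 x 1536 (~3 megapixel) image on disk.
from PIL import Image

img_3mp = Image.open("photo_3mp.jpg")      # e.g. 2048 x 1536 = ~3 MP
target_size = (2816, 2112)                 # 2816 x 2112 = ~6 MP

# Each resampling filter guesses the colours of the inserted pixels differently.
up_nearest = img_3mp.resize(target_size, Image.NEAREST)    # copies the nearest source pixel
up_bilinear = img_3mp.resize(target_size, Image.BILINEAR)  # averages a 2x2 neighbourhood
up_bicubic = img_3mp.resize(target_size, Image.BICUBIC)    # weighted 4x4 neighbourhood

up_bicubic.save("photo_3mp_stretched.jpg")
```

Whichever filter you pick, every one of the extra pixels is computed from the pixels that were already there; none of them comes from the real scene.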
If you compare the stretched 3 megapixel image with the 6 megapixel one, you will find that it doesn't look quite as good or quite as real as the 6 megapixel one.
This is because the 3 megapixel image started at a lower quality: it contained less information about the real scene. Stretching it to the same resolution as the 6 megapixel image added pixels, but those pixels are guesses, not actual information about the real scene, and not quality.
The same is true for each frame of a video. Yes, you can stretch a frame to the size of an HD frame, but all you are doing is adding pixels, not quality.
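To put a number on it, here is a back-of-the-envelope check. The 720x480 standard-definition frame size is an assumption for the example; the point is simply how little of the upscaled frame comes from the original.

```python
# Assuming a 720x480 standard-definition source frame stretched to 1920x1080 (full HD).
sd_pixels = 720 * 480       # 345,600 real pixels from the source frame
hd_pixels = 1920 * 1080     # 2,073,600 pixels needed for the HD frame

guessed = hd_pixels - sd_pixels
print(f"{guessed / hd_pixels:.0%} of the HD frame is interpolated guesswork")
# -> roughly 83% of the pixels carry no new information about the scene
```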