YouTubeDrive

Started by rafiazafar, Sep 19, 2022, 09:11 AM


rafiazafar (topic starter)

Hosting of unlimited size? It sounds fantastic, like something the laws of nature should forbid, much like a perpetual motion machine. But what if it is possible (unlimited hosting, that is, not the perpetual motion machine)?
If you think about it, dozens of Internet services provide unlimited free storage, from social networks to photo hosting. For instance, YouTube limits a single video to 12 hours or 256 GB, but places no restriction on the number of videos.
Theoretically, several 256 GB videos are enough to hide an encrypted copy of your SSD in them.



The only obstacle is that these services check file formats on upload to prevent abuse. In theory, then, it is enough to bypass this check and the problem is solved. But how do you get around it?

The first option that comes to mind is changing the file extension. Today, however, this works on almost no hosting service, because they check both the metadata and the contents of the files.
It's too easy to distinguish a media file from a random set of bits.
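
To see why renaming a file is not enough, here is a minimal sketch of content sniffing by file signature. The PNG, JPEG and MP4 signatures are the standard ones; the function name sniff_media_type is hypothetical, and real hosting services almost certainly do much more than this (they parse and transcode the whole upload).

```python
# Well-known file signatures ("magic bytes").
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_media_type(header):
    """Guess a media type from the first bytes of a file, ignoring its name."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(JPEG_SIGNATURE):
        return "jpeg"
    # MP4 and related ISO base media files carry the box type 'ftyp' at offset 4.
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4"
    # Encrypted or random data will almost always end up here.
    return None
```

An encrypted archive renamed to .mp4 fails even this trivial check, which is why the tools below produce a genuinely valid video instead.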

But what if we use steganography methods — and embed the payload directly into the native media file? And indeed, there are ready-made tools for solving this problem. Here are some of them.

YouTubeDrive

YouTubeDrive is a Mathematica package that re-encodes arbitrary files as RGB pixels. Any data stream is simply transcoded into a video stream and then back again. The program was written five years ago purely as a proof of concept (PoC) that such a thing is possible.

The intermediate video format exists for one purpose only: to be accepted by a video hosting service.

That is, the scheme is as follows:

FILE ARCHIVE → VIDEO → YOUTUBE → and back FILE ARCHIVE

YouTubeDrive is written in the Wolfram Language (the language of Mathematica). It encodes and decodes arbitrary data, and the videos are automatically uploaded to and downloaded from YouTube.

Since YouTube does not impose any restrictions on the total number of videos that users can upload, this provides virtually unlimited hosting. However, such web hosting is extremely slow, because the encoding/decoding procedure takes time first on a local PC, and then on YouTube.

Here is the package code. Of course, the Mathematica program itself must be installed for it to work.

YouTubeDrive uses the standard FFmpeg, youtube-upload and youtube-dl utilities. These programs must be downloaded and installed separately, and before first use you need to specify their installation locations in the YouTubeDrive.wl package (lines 75-77):

FFmpegExecutablePath = "FFMPEG_PATH_HERE";
YouTubeUploadExecutablePath = "YOUTUBE-UPLOAD_PATH_HERE";
YouTubeDLExecutablePath = "YOUTUBE-DL_PATH_HERE";


For instance, in this way:

75 | FFmpegExecutablePath = "C:\\ffmpeg.exe";
76 | YouTubeUploadExecutablePath = Sequence["python",
77 | "C:\\Users\\j0ker\\AppData\\Local\\Programs\\" <>
78 | "Python\\Python35\\Scripts\\youtube-upload.py"];
79 | YouTubeDLExecutablePath = "C:\\youtube-dl.exe";


After installing these programs and editing the configuration lines, you can open the YouTubeDrive.wl package itself in Mathematica (via File ⇨ Install... ⇨ Package ⇨ From File ⇨ YouTubeDrive.wl).


Before using youtube-upload, we upload a test video to our account, log in to YouTube — and get an OAuth token. You will need this authentication token to upload videos via YouTubeDrive.

The algorithm of YouTubeDrive is quite clear. There are two main functions in the package:

    The YouTubeUpload[bytearr] function encodes the bytearr data array as a simple RGB video, uploads it to YouTube and returns the video ID.

    By default, each frame encodes 64×36 RGB pixels, which are scaled up into 2304 squares of 20×20 pixels in a 1280×720 frame. Each RGB value is either 0 or 255, which gives three bits per square. The frames are written to PNG files, which FFmpeg then joins into a valid MP4 video that can survive further transcoding on the YouTube server.

    In practice, the YouTube transcoder can randomly swap the values 0 and 255 in individual pixels, so the large 20×20 squares act as an error correction mechanism.

    The YouTubeRetrieve[videoid] function downloads the YouTube video with the specified video ID, decodes it and returns the recovered data as a ByteArray object.
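
The frame layout described above can be sketched in plain Python. This is not the package's Wolfram Language code, only an illustration under my own assumptions: the helper names are hypothetical, a frame is modelled as a flat bytearray of RGB values, and decoding uses a majority vote inside each square (the post only says the squares act as error correction).

```python
GRID_W, GRID_H, CELL = 64, 36, 20              # values quoted in the post
WIDTH, HEIGHT = GRID_W * CELL, GRID_H * CELL   # 1280 x 720
BYTES_PER_FRAME = GRID_W * GRID_H * 3 // 8     # 3 bits per square


def bytes_to_bits(data):
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def bits_to_bytes(bits):
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def encode_frame(payload):
    """Return one frame as a flat bytearray of RGB values, row by row."""
    if len(payload) > BYTES_PER_FRAME:
        raise ValueError("payload does not fit into one frame")
    bits = bytes_to_bits(payload.ljust(BYTES_PER_FRAME, b"\x00"))
    frame = bytearray()
    for gy in range(GRID_H):
        row = bytearray()
        for gx in range(GRID_W):
            start = (gy * GRID_W + gx) * 3
            pixel = bytes(255 * bit for bit in bits[start:start + 3])
            row += pixel * CELL      # stretch the square horizontally
        frame += row * CELL          # repeat the pixel row vertically
    return frame


def decode_frame(frame, n_bytes):
    """Recover n_bytes by majority vote over each 20x20 square and channel."""
    bits = []
    for gy in range(GRID_H):
        for gx in range(GRID_W):
            for channel in range(3):
                total = 0
                for dy in range(CELL):
                    base = ((gy * CELL + dy) * WIDTH + gx * CELL) * 3 + channel
                    total += sum(frame[base:base + CELL * 3:3])
                # A bit is 1 when at least half of the square's brightness survives.
                bits.append(1 if total * 2 >= 255 * CELL * CELL else 0)
    return bits_to_bytes(bits[:n_bytes * 8])
```

With this layout a square still decodes correctly as long as fewer than half of its 400 pixels are flipped in a given channel, which is the redundancy the 20×20 squares buy at the cost of capacity.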

On YouTube's side, processing even a small video of about 10 MB takes 5-10 minutes. So the file is not yet available for download immediately after YouTubeUpload[bytearr] completes; you have to wait a while before calling YouTubeRetrieve.
If you are interested, you can download the video and watch it, or even decode it by hand. Instead of youtube-dl, its more advanced fork yt-dlp is now usually used to download videos from YouTube.

Of course, such an encoding scheme is not very efficient: about 1 minute of video per 1 MB of data. This figure could easily be improved by an order of magnitude (to about 10 MB per minute of video) if a proper error correction algorithm were implemented instead of the 20×20 squares.
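
The 1 MB per minute figure can be sanity-checked against the frame layout (64×36 squares, three bits each). The calculation below assumes that every frame carries payload and that 1 MB means 10^6 bytes; both are my assumptions, and the package's real frame rate is not stated in this post.

```python
GRID_W, GRID_H = 64, 36      # squares per frame, from the post
BITS_PER_SQUARE = 3          # R, G and B are each either 0 or 255


def bytes_per_frame():
    return GRID_W * GRID_H * BITS_PER_SQUARE // 8


def bytes_per_minute(fps):
    """Payload carried by one minute of video at the given frame rate."""
    return bytes_per_frame() * fps * 60


def implied_fps(target_bytes_per_minute):
    """Frame rate needed to reach a given payload per minute."""
    return target_bytes_per_minute / (bytes_per_frame() * 60)


print(bytes_per_frame())                  # 864
print(round(implied_fps(1_000_000), 1))   # 19.3
```

So one frame holds only 864 bytes, and roughly 19 frames per second are enough to reach the quoted rate, which is consistent with the description above.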

YouTubeDrive is a purely experimental program. The author emphasizes that in no case does he support its use on an industrial scale. As we have already said, the program was written five years ago purely for proof of concept.

YouBit

The idea of a file storage on YouTube looks very attractive. Two years ago, the fvid program appeared, which encodes files without the need to install Mathematica. Of the dependencies, it only needs FFmpeg and libmagic.

Continuing this line of work, the YouBit utility was released recently. It is, so to speak, the state of the art in this area.

Compared to its predecessors, YouBit has been significantly improved. It is fair to say that it takes encoding quality to a new level.

Firstly, the author has abandoned chroma subsampling and encodes all information only in the brightness (luma) channel. The result is black-and-white video, but encoding and decoding become noticeably simpler and faster.

In the brightness channel, pixels take a value from 0 to 255. Accordingly, if we encode one bit per pixel (parameter bpp: 1), then all values from 128 to 255 are read as 1 and all smaller values as 0.
If we encode two bits per pixel (00, 01, 10 or 11), the entire brightness range is divided into four regions (0-63, 64-127, 128-191, 192-255), and so on. The more bits per pixel, the smaller the video file, but the higher the probability that a particular bit is decoded incorrectly after YouTube transcodes the video.
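
The region scheme described above can be sketched as follows. The function names are hypothetical, and writing each symbol at the centre of its region is my assumption; YouBit itself may choose the brightness values differently.

```python
def encode_pixels(data, bpp):
    """Turn bytes into brightness values, bpp bits per pixel."""
    if bpp not in (1, 2, 4, 8):
        raise ValueError("bpp must divide 8")
    levels = 1 << bpp          # number of brightness regions
    step = 256 // levels       # width of one region
    pixels = []
    for byte in data:
        for shift in range(8 - bpp, -1, -bpp):
            symbol = (byte >> shift) & (levels - 1)
            pixels.append(symbol * step + step // 2)   # centre of the region
    return pixels


def decode_pixels(pixels, bpp):
    """Map each brightness value back to the region it falls into."""
    step = 256 // (1 << bpp)
    per_byte = 8 // bpp
    out = bytearray()
    for i in range(0, len(pixels), per_byte):
        byte = 0
        for value in pixels[i:i + per_byte]:
            byte = (byte << bpp) | (value // step)
        out.append(byte)
    return bytes(out)
```

With bpp 1 a pixel can drift by more than 60 brightness levels and still decode correctly; with bpp 2 the margin shrinks to about 30 levels, which is the trade-off between file size and reliability mentioned above.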

Secondly, the author came up with an original way to optimize videos specifically for YouTube's encoder in order to make later decoding more reliable. The catch is that files can now be decoded successfully only after they have been uploaded to YouTube and the result downloaded again.

Thirdly, YouBit works with videos of any resolution; 1920×1080 is optimal.

Finally (fourthly), the author implemented the "zero frames" feature: black frames are inserted between the frames that carry a payload, which reduces the bitrate (and the file size) by about 40% when the video is encoded on the YouTube server.

Usage:

python -m youbit upload C:/myfile.txt chrome

The chrome argument specifies which browser YouBit should extract cookies from to authenticate to YouTube (accordingly, you must be logged in to studio.youtube.com in that browser, with an open session and a cookie).

The upload then takes place in the background through a headless browser driven by Selenium. A very sensible approach.

Downloading and reverse decoding videos:

python -m youbit download https://youtu.be/dQw4w9WgXcQ

Decoding a local video downloaded from YouTube:

python -m youbit decode C:/myvideo.mp4


All operations can be performed via the Python API.
As already mentioned, YouTube limits a single video to 12 hours or 256 GB. If the account is not verified, the limit is 15 minutes.

The default YouBit limit on the size of a single file is 9 GB, but it can be adjusted if desired.

Unlimited hosting in other places

Unlimited file storage is also provided by other services, for instance:

    Telegram (2 GB file size limit, 4 GB in the paid version)
    Social networks (any files can be stored under the guise of videos and photos; storage there is usually unlimited)
    Habrastorage


To work with these "non-traditional hosting services", you can write your own client or file manager: a small program that uploads files via YouBit, so that you do not have to run it by hand.

Ideally, I would like to add the following functions there:

    Support for different hosting services (different social networks, photo and video hosting) so that YouBit copies videos with our files not only to YouTube, but also to other web hosting services that provide unlimited disk space for free.
    Backup. Since every site should be treated as temporary, redundant copies should be kept.
    Proper synchronization: updating new versions or new files without rewriting the entire archive.

But that is the ideal. A minimal version needs only a simple script that, in batch mode, encodes a couple of hundred files into MP4 via YouBit and uploads the videos to several video hosting sites.
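
A minimal sketch of such a batch script, covering only the YouTube step. It simply repeats the command line quoted earlier (python -m youbit upload FILE BROWSER) for every file in a directory; the function names and the dry_run switch are hypothetical, and the sketch assumes YouBit is installed and the browser is already logged in.

```python
import subprocess
import sys
from pathlib import Path


def build_upload_command(path, browser="chrome"):
    """Mirror the CLI call quoted in the post: python -m youbit upload <file> <browser>."""
    return [sys.executable, "-m", "youbit", "upload", str(path), browser]


def upload_all(directory, browser="chrome", dry_run=True):
    """Upload every regular file in a directory; return the commands used."""
    commands = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue                      # skip subdirectories
        command = build_upload_command(path, browser)
        commands.append(command)
        if not dry_run:
            # Runs YouBit for real; raises CalledProcessError if an upload fails.
            subprocess.run(command, check=True)
    return commands
```

Calling upload_all("some_directory") only lists what would be run; passing dry_run=False actually starts the uploads, one file after another.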

As for storing files on YouTube (and in other services of large corporations), the essence boils down to using the infrastructure not exactly as intended, but at the same time completely legally.
I repeat, if a user disguises his personal files as videos and uploads them to public hosting, he does not violate the laws of any country in any way. In any case, this is not a commercial program but just a theoretical experiment, and it is up to you whether to repeat it.

Bubunt

I think the author is quite able to afford to pay for cloud storage of the size he needs, and he created all these tools solely from the point of view of academic interest.
In my opinion, this is quite an interesting task: to understand the ways of encoding information and to optimize the algorithm for specific real-world conditions (in this case the YouTube algorithm, but it could be anything).

Sharing convenient tools for such abuse may not be a very good thing, of course, but the author is clearly not a pioneer in this area. If there is demand for them, such tools would appear sooner or later, and something like this surely already exists in a dozen variants.

allricjohnson1

Yes, but only if somebody abused it significantly. Even if we assume that everyone who has read this topic starts dumping all their data onto YouTube this way, an ordinary streamer who streams 4k60 for 3-5 hours 2-5 times a week will probably surpass all these hoarders in data volume.

In addition, I have often seen it claimed that traffic costs YouTube much more than storage.
And what kind of traffic will such hoarders generate? About zero, since they obviously won't pull that data back and forth very often. Even if we assume they download it 20 times a day, that is still nothing compared to the number of views streamers get.

Plus, there is a chance that YouTube stores all uploaded videos in the original or near-original form and serves a heavily compressed version only for viewing. This assumption is based on the fact that some 9-year-old videos re-encoded in av1 have better quality than in other codecs, and to do that you need to have the originals. So storage volume is not critical at all, and no matter how hard you try, you won't be able to affect YouTube in any way.

manivel

For instance, I have seen many videos that have better quality in vp9 than in h264, which means the vp9 versions were encoded from the sources, or from something close to them in quality.
It's the same now with av1. There are videos from 15 years ago encoded in av1 with better quality than in vp9.

But of course such videos are few; they are mostly exceptions, because vp9 and av1 generally get a lower bitrate than h264, and with the lower bitrate the quality is worse too.
For example, such size ratios are quite common: h264 - 100 MB, vp9 - 75 MB, av1 - 60 MB. Of course, in this case vp9 is very likely worse than h264, and av1 is worse than vp9.
And when the ratios, for example, are as follows: h264 -100 MB, vp9 - 85 MB, av1 - 76 MB, then the situation is reversed.