GSoC: File downloading week

According to my proposal’s schedule, I should’ve been working on the Storage interface, token saving and the Dropbox and OneDrive stubs this week. I tried to schedule less work before the midterm because I have some exams to pass. Still, I had managed to do most of that first week’s plan during my preparation work, so I finished the rest by Tuesday and started working on the things planned for the second week: file downloading.

I also implemented the DownloadRequest class that Tuesday. It simply reads bytes from a NetworkReadStream and writes them into a DumpFile.
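The core of such a request is just a copy loop that moves one chunk per step. Here is a minimal sketch of that idea; `FakeReadStream`, `FakeDumpFile` and `downloadStep` are illustrative stand-ins I made up, not ScummVM’s actual NetworkReadStream/DumpFile API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for ScummVM's NetworkReadStream and DumpFile;
// the real classes have richer APIs, these only model read/write/eos.
struct FakeReadStream {
    std::vector<unsigned char> data;
    size_t pos = 0;
    bool eos() const { return pos >= data.size(); }
    size_t read(unsigned char *buf, size_t n) {
        size_t avail = data.size() - pos;
        if (n > avail) n = avail;
        for (size_t i = 0; i < n; ++i) buf[i] = data[pos + i];
        pos += n;
        return n;
    }
};

struct FakeDumpFile {
    std::vector<unsigned char> written;
    void write(const unsigned char *buf, size_t n) {
        written.insert(written.end(), buf, buf + n);
    }
};

// One handle() step of a DownloadRequest-like loop: copy one chunk per call.
// Returns true when the stream is exhausted and the request is finished.
bool downloadStep(FakeReadStream &in, FakeDumpFile &out) {
    unsigned char buffer[4096];
    size_t n = in.read(buffer, sizeof(buffer));
    if (n > 0) out.write(buffer, n);
    return in.eos();
}

bool demoDownloadCopiesEverything() {
    FakeReadStream in;
    for (int i = 0; i < 10000; ++i)
        in.data.push_back((unsigned char)(i % 256));
    FakeDumpFile out;
    while (!downloadStep(in, out)) {}   // keep stepping until finished
    return out.written == in.data;
}
```

Stepping chunk by chunk (rather than copying everything in one call) matters here, because each step happens on a timer tick and the request stays cancellable in between.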

There was a small problem with DumpFile: it fopen’s a file (creating it if it doesn’t exist), but it assumes that the file is located in some existing directory. When you’re downloading a file from the cloud into your local folder, you might not have the same directory hierarchy there. Thus, fopen fails and no download occurs. As there was no createDirectory() method in ScummVM, I had to implement one. So, now DumpFile accepts a bool parameter, which indicates whether all missing directories should be created before fopen’ing the file.
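The fix boils down to walking the path and creating each missing prefix directory before the final fopen. A minimal POSIX sketch of that idea (function names and the `mkdir`-based approach are my illustration, not ScummVM’s portable implementation):

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Create every missing directory along a file path (POSIX sketch).
// "saves/cloud/game/slot.dat" -> mkdir saves, saves/cloud, saves/cloud/game.
bool createParentDirectories(const std::string &filePath) {
    for (size_t i = 0; i < filePath.size(); ++i) {
        if (filePath[i] != '/')
            continue;
        std::string prefix = filePath.substr(0, i);
        if (prefix.empty())
            continue;                       // leading '/' of absolute paths
        if (mkdir(prefix.c_str(), 0755) != 0 && errno != EEXIST)
            return false;                   // a real error, not "already there"
    }
    return true;
}

// DumpFile-like open: optionally create the directory hierarchy first.
FILE *openDumpFile(const std::string &path, bool createPath) {
    if (createPath && !createParentDirectories(path))
        return nullptr;
    return fopen(path.c_str(), "wb");
}

bool demoCreatePath() {
    // With createPath == true the missing directories get created first.
    FILE *f = openDumpFile("demo_dl_dir/sub/file.bin", true);
    bool created = (f != nullptr);
    if (f) fclose(f);
    // Without it, fopen into a nonexistent directory fails, as described.
    FILE *g = openDumpFile("demo_missing_dir/sub/file.bin", false);
    bool failed = (g == nullptr);
    if (g) fclose(g);
    return created && failed;
}
```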

Then we decided I should «upgrade» our Requests/ConnectionManager system by adding a RequestInfo struct and giving each Request an id. User code could use the id to locate the Request through ConnMan and ask it to change its state. That’s mostly needed when we want to make the same request again: for example, if some error occurred, we’d probably like to try again in a few seconds. This system added a RETRY state for Requests, so user code can easily ask for a Request to be retried. The original idea with RequestInfo and ids was later rethought, so now we actually use pointers to Request instances. That way one can easily affect a Request: cancel, pause or restart it through its methods.
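To make the state machine concrete, here is a small sketch of a Request base class with those states. The class layout, state names and `tick()`/`restart()` hooks are my illustration of the described behavior, not ScummVM’s exact API:

```cpp
#include <cassert>

// Sketch of a Request base class with pause/retry states as described above.
class Request {
public:
    enum State { PROCESS, PAUSED, RETRY, FINISHED };

    Request() : _state(PROCESS) {}
    virtual ~Request() {}

    State state() const { return _state; }
    void pause()  { _state = PAUSED; }
    void retry()  { _state = RETRY; }     // user code asks to run this again
    void finish() { _state = FINISHED; }

    // Called by a ConnectionManager-like loop on every tick.
    void tick() {
        if (_state == RETRY) { restart(); _state = PROCESS; }
        if (_state == PROCESS) handle();
    }

protected:
    virtual void handle() = 0;    // one step of actual work
    virtual void restart() {}     // reset internal state before a retry

private:
    State _state;
};

// Demo request: "fails" on the first attempt, succeeds after a retry.
class FlakyRequest : public Request {
public:
    int attempts = 0;
protected:
    void handle() override {
        ++attempts;
        if (attempts == 1) pause();   // simulate an error: wait for a retry
        else finish();
    }
};

bool demoRetryAfterFailure() {
    FlakyRequest r;
    r.tick();                                  // first attempt: error, pauses
    bool pausedAfterError = (r.state() == Request::PAUSED);
    r.retry();                                 // caller holds a Request pointer
    r.tick();                                  // restarted, then finished
    return pausedAfterError && r.state() == Request::FINISHED && r.attempts == 2;
}
```

The key point is the one in the post: because the caller holds a pointer to the Request, retrying doesn’t require stashing the request’s parameters anywhere else; they live inside the paused object.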

With that system I could easily implement the OneDriveTokenRefresher. That might not be the best class name, but it describes the class quite well =) OneDrive access tokens expire after an hour, so it’s quite sad to get an error just because you’ve been working with the app for more than an hour. This class wraps the usual CurlJsonRequest (which is used to receive the JSON representation of the server’s response) to hide errors caused by an expired token. Basically, it peeks into the received JSON and, if there is an error message, refreshes the token and then retries the original request using the new token. With the new system, it just pauses the original request and then retries it: there is no need to save the original request’s parameters anywhere, because they are stored along with the paused request.
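The wrapping logic itself is small. Below is a toy sketch of the peek-refresh-retry flow; the «server», token store and error string are all made up for illustration and don’t reflect OneDrive’s real endpoints or ScummVM’s classes:

```cpp
#include <cassert>
#include <string>

// Toy "server" that rejects expired tokens with a JSON error object.
struct FakeServer {
    std::string validToken = "token-v2";
    std::string respond(const std::string &token, const std::string &path) {
        if (token != validToken)
            return "{\"error\":{\"code\":\"unauthenticated\"}}";
        return "{\"name\":\"" + path + "\"}";
    }
};

// Toy token store: refresh() swaps the expired token for a fresh one.
struct TokenStore {
    std::string token = "token-v1";   // expired by now
    int refreshCount = 0;
    void refresh() { token = "token-v2"; ++refreshCount; }
};

// TokenRefresher-style wrapper: peek into the reply, and if it carries an
// error object, refresh the token and replay the original request once.
std::string requestWithRefresh(FakeServer &server, TokenStore &store,
                               const std::string &path) {
    std::string reply = server.respond(store.token, path);
    if (reply.find("\"error\"") != std::string::npos) {
        store.refresh();                            // get a fresh access token
        reply = server.respond(store.token, path);  // retry the same request
    }
    return reply;
}

bool demoRefreshHidesExpiredToken() {
    FakeServer server;
    TokenStore store;
    std::string reply = requestWithRefresh(server, store, "saves");
    // The caller only ever sees the successful reply.
    return store.refreshCount == 1 && reply.find("error") == std::string::npos;
}
```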

After I finished that useful token-refreshing class, I implemented OneDriveStorage::download(), thus completing the second week’s plan too. The plan also includes an «auto detection» feature, which is being delayed a little because it would be called from the GUI, and I’m not working on the GUI yet. On the other hand, folder downloading wasn’t in my plan, and it’s obviously a must-have feature, so that’s what I did next. FolderDownloadRequest uses Storage’s listDirectory() and download() methods, so it’s easy to implement and it works with any Storage implementation that has these methods working.
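Because it only relies on those two Storage methods, the whole folder download reduces to «list, then download each entry». A compact sketch with a fake Storage (the real one downloads into DumpFiles asynchronously; here I return strings synchronously just to show the shape):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal Storage stand-in with the two methods FolderDownloadRequest needs.
struct FakeStorage {
    std::map<std::string, std::string> files;   // remote path -> contents

    std::vector<std::string> listDirectory(const std::string &dir) {
        std::vector<std::string> result;
        for (const auto &f : files)
            if (f.first.compare(0, dir.size(), dir) == 0)  // path under dir?
                result.push_back(f.first);
        return result;
    }

    std::string download(const std::string &path) { return files[path]; }
};

// FolderDownloadRequest sketch: list the remote directory, then download
// every listed file (a real implementation writes them to local files).
std::map<std::string, std::string>
downloadFolder(FakeStorage &storage, const std::string &dir) {
    std::map<std::string, std::string> local;
    for (const std::string &path : storage.listDirectory(dir))
        local[path] = storage.download(path);
    return local;
}

bool demoFolderDownload() {
    FakeStorage s;
    s.files["saves/a.sav"] = "A";
    s.files["saves/b.sav"] = "B";
    s.files["music/c.ogg"] = "C";   // outside the requested folder
    std::map<std::string, std::string> local = downloadFolder(s, "saves/");
    return local.size() == 2 && local["saves/a.sav"] == "A";
}
```

This is also why it works for every Storage backend: the request never talks to a provider’s API directly, only to the Storage interface.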

I actually implemented OneDriveStorage’s listDirectory() a little later, because there is no way to list directories recursively through OneDrive’s API. Dropbox does offer that, so there it takes me one API call to list a whole directory and all its subdirectories’ contents. To list OneDrive directories recursively, I had to write a special OneDriveListDirectoryRequest class, which lists the directory with one API call, then lists each of its subdirectories with further calls, then their subdirectories, and so on until everything is listed.
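That «and so on» is just a worklist of directories still to visit, one API call each. A sketch of the traversal (the tree structure and `listOne()` are invented stand-ins for the one-level directory listing the API provides):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Entry {
    std::string path;
    bool isDirectory;
};

// Fake remote tree: one "API call" lists only the immediate children.
struct FakeTree {
    std::map<std::string, std::vector<Entry>> children;
    int apiCalls = 0;
    std::vector<Entry> listOne(const std::string &dir) {
        ++apiCalls;
        return children[dir];
    }
};

// OneDriveListDirectoryRequest-style loop: keep a worklist of directories
// still to list, one API call each, until the whole subtree is covered.
std::vector<Entry> listRecursively(FakeTree &tree, const std::string &root) {
    std::vector<Entry> result;
    std::vector<std::string> toVisit;
    toVisit.push_back(root);
    while (!toVisit.empty()) {
        std::string dir = toVisit.back();
        toVisit.pop_back();
        for (const Entry &e : tree.listOne(dir)) {
            result.push_back(e);
            if (e.isDirectory)
                toVisit.push_back(e.path);   // its children come in later calls
        }
    }
    return result;
}

bool demoRecursiveListing() {
    FakeTree t;
    t.children["root"] = { {"root/saves", true}, {"root/readme", false} };
    t.children["root/saves"] = { {"root/saves/slot1", false} };
    std::vector<Entry> all = listRecursively(t, "root");
    // 3 entries total, reached with one API call per directory (2 calls).
    return all.size() == 3 && t.apiCalls == 2;
}
```

In the real asynchronous version each `listOne` is a separate network request, so the worklist lives inside the Request object between handle() calls.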

I’ve been asked to draw some sequence diagrams explaining this whole Requests/ConnectionManager system, so that’s what I’m going to do now. I’m not a fan of sequence diagrams, so I’m going to draw them in my own style, which, I believe, makes them simpler.

UPD: here they come: two small diagrams (also available on the ScummVM wiki):

File sync comes next week, and I guess I’m going to discuss that feature a lot before I actually start implementing it.

GSoC: Work starts tomorrow

(At least, that’s «tomorrow» in my timezone.)

So, last time I wrote about doing some preparation work before GSoC starts, and I believe I did quite well.

The JSON parser works fine, even though we have to replace all Unicode characters with '?' to make it work.

Now I’m checking that my code compiles not only with MSVC, but also with gcc through the MinGW shell. Thus, I have tested the configure changes sev made: libcurl detection and usage are successful.

There is already a small «framework» for libcurl: I’ve added ConnMan (ConnectionManager, similar to ConfMan), which can be used to start curl requests. There is a special CurlJsonRequest, which reads the whole response, parses it as JSON and then passes the result to its caller. Finally, I committed Callbacks yesterday. I still keep those in a separate branch, because some edits might be necessary, but I believe it’s a very good thing to have.

These Callback classes are not just plain pointer-to-function stuff. They are «object-oriented callbacks», meaning each is actually a «pointer to an object and its method». Plus, I also made it possible to specify an argument type. Yes, those are templates. I actually remembered that I already did such a thing a few years back while I was at school. «Functor» might be the wrong name for it, but the idea is still the same: there is a base class, which doesn’t know anything about the class whose method we’re going to point to, and there is a derived class, which implements it. We use pointers to the base class, so we don’t care whether it’s Callback&lt;A&gt;, Callback&lt;B&gt; or SomeOtherCallbackImplementation. Thus, we don’t have to write all those ugly global function callbacks, cast the only void * parameter to some struct and use the fields of that struct to do the work. We just write Callback&lt;ClassName&gt;, pass ClassName::method, and that method is automatically called when the callback’s operator() is invoked. If we want to specify that ClassName::method accepts not void * but, for example, AnotherClass, we just write Callback&lt;ClassName, AnotherClass&gt; (note that it accepts not a pointer, but an object itself). It’s as simple as that!
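The idea above fits in a few dozen lines. Here is a minimal sketch of such a two-class template callback; the names (`BaseCallback`, `Callback`) match the spirit of the description but are not necessarily ScummVM’s exact declarations:

```cpp
#include <cassert>
#include <string>

// Base class: knows only the argument type, nothing about the receiver.
template<typename Arg>
struct BaseCallback {
    virtual ~BaseCallback() {}
    virtual void operator()(Arg value) = 0;
};

// Derived class: binds a concrete object and one of its methods.
template<typename T, typename Arg>
struct Callback : BaseCallback<Arg> {
    T *object;
    void (T::*method)(Arg);
    Callback(T *o, void (T::*m)(Arg)) : object(o), method(m) {}
    void operator()(Arg value) override { (object->*method)(value); }
};

// Demo receiver: some class whose method we want called back.
struct Collector {
    std::string last;
    void store(std::string s) { last = s; }
};

bool demoCallback() {
    Collector c;
    // Caller code only sees BaseCallback<std::string>*; it has no idea the
    // target is a Collector, which is exactly the point.
    BaseCallback<std::string> *cb =
        new Callback<Collector, std::string>(&c, &Collector::store);
    (*cb)("hello");
    delete cb;
    return c.last == "hello";
}
```

No global trampoline functions, no `void *` casting: the type of the receiving object is erased behind the base class, while the argument type stays checked by the compiler.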

I also wanted to do something about Storages: save the corresponding access tokens and such into the configuration file. Well, right now it remembers the access token of the «current» storage, so I don’t have to authenticate every time I launch ScummVM. Still, I’d like to remember the tokens of all connected storages, so the user can easily switch between them.

Finally, there was «writing an API skeleton» in my plan. Well, this whole cloud system works as I thought it would. There are no real API method implementations yet, but apart from that it looks fine.

OK, so I have to start working next week. The general plan is to design the API and implement support for Dropbox and some other provider. My proposal schedule states the following:

May 23 — May 29
Write Storage interface, implement tokens saving.
Add Dropbox and OneDrive storages.

May 30 — June 5
Implement file downloading (Dropbox and OneDrive).
Add auto detection procedure running after the download.

The Storage interface is more or less there, and the token is saved. «Add» doesn’t imply the storage would be completely functional, so adding the Dropbox storage is also done.

So, the actual plan is to upgrade Storage with some config-saving methods and make the token-saving feature support multiple Storages. After that, I guess I’ll add a OneDrive stub and start implementing some API methods in the Dropbox and OneDrive backends.

I still have university studies here, and exams are getting close. This means sometimes I’ll have to study instead of working, and I might end up behind the schedule, not ahead of it. And that means I’m going to have no weekends closer to the midterm and after it =)

GSoC: Cloud Integration Preparation Work

After discussing API design, I’ve done some research on the Dropbox, Google Drive and OneDrive APIs and discussed further steps with Peter and Eugene. I’ve identified some preparation steps I should take before GSoC starts. Doing this preparation work also gets me familiar with ScummVM’s code, coding & formatting conventions and commit guidelines.

I’m working on it in my public fork of the scummvm repo on GitHub, right in the master branch. Eugene helped me with configure (which I’m not using yet, as I’ve started with MSVC) and watches how I’m doing.

My preparation work plan includes the following tasks:

  • integrating JSON parser;
  • writing API skeleton;
  • adding libcurl and writing some simple wrapper for it;
  • adding some cloud-related config keys, so ScummVM would remember my access token and I won’t have to authenticate every time I launch it.


We’ve decided that SimpleJSON is a good library to use as the JSON parser in ScummVM. It’s a really simple library, consisting of two classes and using the C++ standard library. It is now available as Common::JSON in my fork and uses ScummVM classes.

A few weeks back, I was still thinking that all cloud-related work would be done within a separate thread spawned in main(). But when I decided to actually add such a thread (so my JSON examples could work without delaying ScummVM’s launcher), I realized that the original idea of using libcurl’s blocking functions was a bad one. libcurl’s functions block thread execution until the request is complete. Because of that, our cloud thread wouldn’t react to the user’s commands while it’s «busy». And that means that if you started downloading 2 GB on 56 kbps, you couldn’t cancel it without killing the ScummVM process!

That’s totally not what we want. There is no direct thread spawning in ScummVM, and even though TimerManager actually runs callbacks in separate threads, such threads might keep working after ScummVM’s main thread has finished (and TimerManager could be reimplemented so that it no longer runs callbacks in separate threads). So I had to think of something. At first my idea was to add a callback through TimerManager that would be executed every second. It would check whether the user wanted ScummVM to make some API requests and whether those requests were complete. But most of the time we don’t work with the cloud, so such a callback would run even when it was unnecessary.

Now the idea is a bit more complex. We have to use libcurl’s async functions, because we don’t want our callback to block. So our API methods can’t return the requested information directly; instead, each should receive a function pointer and call the passed function when the information is ready. I also believe that there could be more than one request pending, and some methods would require more than one REST API request.

Thus, I’m thinking of making a special Request class, which would act as a base class for all method implementations. For example, ListDirectoryRequest would not only request the first «page» of directory contents, but keep making requests until the whole directory is listed. Only when all the information is gathered would it call the callback.
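The page-accumulating behavior can be sketched like this. The paged «server», the request’s shape and the plain function-pointer callback are all simplifications I made up to show the idea (the real design would use the Callback classes and libcurl):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Fake paged "server": each call returns one page and says if more remain.
struct PagedServer {
    std::vector<std::vector<std::string>> pages;
    size_t next = 0;
    bool fetch(std::vector<std::string> &out) {   // true if more pages left
        out = pages[next++];
        return next < pages.size();
    }
};

// Request sketch: handle() is called repeatedly (e.g. from a timer); it pulls
// one page per call and fires the callback only when everything is gathered.
struct ListDirectoryRequest {
    PagedServer &server;
    std::vector<std::string> gathered;
    void (*callback)(const std::vector<std::string> &);
    bool finished = false;

    ListDirectoryRequest(PagedServer &s,
                         void (*cb)(const std::vector<std::string> &))
        : server(s), callback(cb) {}

    void handle() {
        std::vector<std::string> page;
        bool more = server.fetch(page);
        gathered.insert(gathered.end(), page.begin(), page.end());
        if (!more) {
            finished = true;
            callback(gathered);    // only now does the caller hear back
        }
    }
};

static std::vector<std::string> g_result;
static void onListed(const std::vector<std::string> &files) { g_result = files; }

bool demoPagedListing() {
    PagedServer server;
    server.pages = { {"a", "b"}, {"c"} };
    ListDirectoryRequest req(server, onListed);
    while (!req.finished)
        req.handle();              // two handle() calls, one per page
    return g_result.size() == 3 && g_result[2] == "c";
}
```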

The Storage class, which would actually represent a cloud storage API in ScummVM, has all the API methods we need. But these methods would only create new Request objects; the implementation would be hidden within those Request classes. So, Storage would also contain a Request list. When it’s empty, there is nothing for Storage to do, and it does nothing. But when it is not, Storage would start a timer and poll these requests to see whether they have completed their work. When a request is complete, it is removed from the list. When the list becomes empty, Storage stops the timer.

Finally, there is also a CloudManager: a class which would load all connected storages from ScummVM’s config and provide fast access to the currently active Storage. So, the idea is to have all the storages loaded and ready, so the user can easily switch between them in the Options menu. But only one Storage would be «active». This storage would be used to sync saves, upload and download files until the user changes that. CloudManager would not only have a getter for it, but also a few «shortcut» methods, which use the current storage internally. For example, you wouldn’t have to do cloudManager->getCurrentStorage()->syncSaves(). Instead, you could just do cloudManager->syncSaves().
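A tiny sketch of that shape, with config loading omitted and a stub storage type standing in for the real backends (all names here are illustrative):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stub storage: each connected cloud provider would get one of these.
struct StorageStub {
    std::string name;
    int syncs = 0;
    void syncSaves() { ++syncs; }
};

// CloudManager sketch: owns every loaded storage, tracks the active one,
// and forwards "shortcut" calls to it.
struct CloudManagerSketch {
    std::vector<StorageStub> storages;   // would be loaded from the config
    size_t current = 0;

    StorageStub *getCurrentStorage() { return &storages[current]; }
    void switchStorage(size_t index) { current = index; }

    // Shortcut: manager.syncSaves() instead of
    // manager.getCurrentStorage()->syncSaves().
    void syncSaves() { getCurrentStorage()->syncSaves(); }
};

bool demoCloudManager() {
    CloudManagerSketch manager;
    StorageStub dropbox;  dropbox.name = "Dropbox";
    StorageStub onedrive; onedrive.name = "OneDrive";
    manager.storages.push_back(dropbox);
    manager.storages.push_back(onedrive);

    manager.syncSaves();               // goes to Dropbox (storage 0)
    manager.switchStorage(1);
    manager.syncSaves();               // now goes to OneDrive
    return manager.storages[0].syncs == 1 && manager.storages[1].syncs == 1;
}
```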

I have not implemented Requests yet, but there are already some simple CloudManager and Storage stubs.

UPD: I’ve implemented the first simple Request today. It’s not doing anything useful, but it shows that requests can «work» for a few handle() calls and then stop. When no requests are working, the timer is automatically stopped.

Why do we need a timer in the first place? Well, that’s because we want to know whether a request is complete, and we have to poll curl_multi_info_read to find out. Plus, this allows us to react to the user’s commands. For example, the user might want to cancel a download, and then the Request would be stopped on the next timer tick.