Wedge

Topic started by: Arantor on October 13th, 2011, 06:19 PM

Title: Plugin servers / getting plugins to a system
Post by: Arantor on October 13th, 2011, 06:19 PM
So, this is one of the things I've been thinking about lately, is how to make plugin servers work, on a technical level.

This is going to be something of a stream of consciousness rather than a focused discussion, to start with. It's more about trying to make sense of all my thoughts, and if you have anything to add, please do.

Let's look at what SMF has (not what it actually uses). It's all in the Packages > Download Packages page.

Up front that's stupid. You click a page called Download, so that you can upload something. It's almost as backward as the logic of clicking a Start Button to shut down, though at least they have the argument of 'starting the shutdown process'.

Idiot question of the day... should uploading plugins from your computer be a separate page to downloading from a plugin server? Gut instinct says yes, it should, because they're totally different options and processes.

That also leads in nicely to something Dragooon mentioned, about plugins having an 'app storefront' kind of deal. I've mentioned before that I'd like to be able to support the environment where creators can publish paid plugins, and still make use of something like the package server facility, currently not possible in SMF. (Incidentally I was thinking about that almost 18 months ago, if not further back)

Hmm. Seems to me that the actual process of getting a list of plugins is at least as dependent on what interface to use to display, as anything else. SMF provides the ability to accept a list of plugins, and/or a list of categories, wherein you can browse categories. What most people don't realise is that you can also segment the package server list by category, so you only need serve a list of what categories there are, then what's in each one - as opposed to the behemoth served by SMF currently that actually breaks some installs with low memory limits.

Then again, that's when you have a public list. If the list is potentially private or semi private, or in any way filterable, it needs to be accommodated by the server, not the client, which means that just serving a packages.xml type file isn't an answer. Hmm. I'm sensing that you'd make a request on a given page much as you do with feeds, and make it essentially similar to a RESTful style implementation, so that the URL http://example.com/forum/index.php?action=plugins would provide a list of plugins available for download, and adding ;cat=x would filter the list by the specified category. Of course, that would be a detail provided by the server.
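To make the "server filters, not the client" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function names, the `cat` and `private` keys, and the membership flag are invented, and only the `?action=plugins;cat=x` URL shape comes from the post above.

```php
<?php
// Hypothetical helper: build the URL a client would request for a plugin list.
// The ';cat=x' form follows the URL style described in the post.
function plugin_list_url($base, $cat = null)
{
    $url = $base . '?action=plugins';
    if ($cat !== null)
        $url .= ';cat=' . rawurlencode($cat);
    return $url;
}

// Hypothetical server side: filter the catalogue before anything is sent,
// so private or out-of-category entries never reach the client at all.
function filter_plugins(array $plugins, $cat = null, $is_member = false)
{
    $names = array();
    foreach ($plugins as $plugin)
    {
        // Skip anything outside the requested category.
        if ($cat !== null && $plugin['cat'] !== $cat)
            continue;
        // Skip private (e.g. premium) entries for non-members.
        if (!empty($plugin['private']) && !$is_member)
            continue;
        $names[] = $plugin['name'];
    }
    return $names;
}
```

The point of the design is that a static packages.xml cannot do the second check, because it has no idea who is asking.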

I'm a little wary of providing something that amounts to a storefront structure but I'm OK with the architecture providing lists of what's available and so on, especially as for anything that's premium, you'd have to be a member of their site anyway and have access to the relevant downloads.

Hmm. Should a plugin be able to list the plugin server or servers where it can be found?

Also, what exactly is the process for plugins requesting updates? I'm sensing it would be a daily process job, where the system goes to all the plugin servers it knows, with whatever credentials it has, and supplies a list of plugins. (Active plugins, all plugins?) The server itself should be doing the work on filtering down plugins, not the client. Though, if the plugin itself indicates a server or servers where it can be found, presumably only those need to be queried?
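As a rough sketch of the server doing that work: the client supplies plugin IDs with the versions it has, and the server answers only with the ones it holds something newer for. The function name and the ID format are assumptions for illustration; `version_compare()` is the standard PHP function.

```php
<?php
// Hypothetical server-side check: given the versions a client reports,
// return only the plugins for which the server holds a newer version.
// Plugins the server has never heard of are silently ignored.
function find_updates(array $installed, array $catalogue)
{
    $updates = array();
    foreach ($installed as $id => $version)
    {
        if (isset($catalogue[$id]) && version_compare($catalogue[$id], $version, '>'))
            $updates[$id] = $catalogue[$id];
    }
    return $updates;
}
```

A daily job would then loop over the servers it knows, send the same list to each, and merge whatever comes back.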

One last thing. Do we really need to support both .zip and .tar.gz? Gut instinct says no, just zip will cover the bulk of what's actually needed? (even if it does provide a shade of inconvenience for those who use *nix / OS X that write plugins... which is going to be a limited enough group anyway?)
Title: Re: Plugin servers / getting plugins to a system
Post by: live627 on October 13th, 2011, 07:52 PM
Quote
Do we really need to support both .zip and .tar.gz? Gut instinct says no, just zip will cover the bulk of what's actually needed?
I use the gzipped tars because SMF randomly has problems with ZIPs.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 13th, 2011, 09:14 PM
I'm thinking ZipArchive ;)
Title: Re: Plugin servers / getting plugins to a system
Post by: live627 on October 13th, 2011, 09:18 PM
Remember, it's not built in to PHP/5.2.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 13th, 2011, 09:20 PM
Yeah, that's something that concerned me. That said, there is a PEAR library equivalent that can be safely used if needed.

What about the rest of my stream of quasi-nonsense? :P
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 13th, 2011, 11:26 PM
Quote from live627 on October 13th, 2011, 07:52 PM
Quote
Do we really need to support both .zip and .tar.gz? Gut instinct says no, just zip will cover the bulk of what's actually needed?
I use the gzipped tars because SMF randomly has problems with ZIPs.
How so? read_tgz_file/read_zip_file has always worked for me...
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 13th, 2011, 11:48 PM
There were issues with it concerning 64 bit builds, specifically there was a bug in the way it tried to load zips made on a 64 bit build where phantom offsets were introduced and used.

I was of the understanding that they'd been fixed though.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 14th, 2011, 01:24 AM
OK, let's recap.

.zip vs .tar.gz.
There are more rugged solutions than read_zip_data() out there, one of which is ZipArchive. It's not the only solution, however, and it won't be that hard to find such a solution.

My theory of using only one file type in general just means that it simplifies the process and the code, and minimises the overall number of hassles people have, IMHO.

Should uploading plugins from your computer be a separate page to downloading from a plugin server?
I'm thinking that getting from a plugin server would be a 'Find' page, and uploading from your computer would be an 'Upload' page, as Plugins > Find and Plugins > Upload.

Should a plugin be able to list the plugin server or servers where it can be found?
This one is totally dependent on the expected prevalence of plugin servers. There are worryingly few for SMF, which has pretty much always amounted to people running their own for large projects. I think it's still in single digits.

If there's a (relatively easy) way to specify a plugin server it would help matters, but the more I think about it, the more I think a plugin shouldn't specify its home as a target for lookups, not directly. (As in: we can't expect plugin authors to expressly put in the Wedge plugin server generally, but the plugin system can query all the servers it knows for updates)


I think I almost have to just go away and start building before I can really answer some of these questions; I don't think I can answer them ahead of time.

(Just a quick note, I took a look today at how WP handles plugin updates on the fly. Interestingly, it's practically the same as I'd outlined Wedge would have to be doing, that the download occurs, the old plugin is disabled, the new one enabled, and the old plugin then deleted. I still think the user should perform those steps, though, rather than entirely automating it. I don't know. Might have to experiment with how well that works out for us.)
Posted: October 14th, 2011, 01:15 AM

Oh, and I'm going to have to write a plugin server of my own before long, as I've always (always!) kept my webserver's folder as a copy of the files, rather than directly hosting SVN files. In this case, C:/Dev/wedge_plugins currently holds all my plugins as a SVN tree, but C:/Dev/public_html/smf/wedge/ is where my testing site is... would be nice to have a solution for grabbing plugins that was quicker than just copying folders (and, as a side matter, didn't include .svn folders)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 16th, 2011, 11:53 PM
Bump for feedback. I've been doing work on other parts of the system that will feed into this... before long I'm going to have to implement this and would rather do it once and do it right...
Title: Re: Plugin servers / getting plugins to a system
Post by: DarkLite on October 17th, 2011, 12:15 AM
I've spent a depressing amount of time lurking as a guest just to follow Wedge's progress, and thought I'd throw in my £0.02 here.

While I agree that in principle downloads and uploads are two very different things and would seem to require two pages, I don't think the upload page has enough content to merit a separate page. I can't envision anything other than a single "choose file" dialog (but there may well be other possibilities I haven't thought of?).

As a result, is it really worth splitting it off into a page with one button and no options? It'd look somewhat odd to the average end user, given that no other admin page contains so little stuff (even the Languages->Settings page has some options).

What about rebranding "Download packages" as "Add a package", rather than splitting it up? I think that would cover both "upload" and "download" scenarios, with the uploads being handled by a button at the top and the downloads by the server browsing system below.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 12:30 AM
That makes sense, probably more than splitting them up, though the download process is likely to be more powerful than before, so I was also debating putting the whole 'package server' management into its own page, because it does need a shade more in future.

There is actually one page that does have so few things on it - the caching page. For most of the options therein, you don't get any configuration other than to pick caching type.

(Also, welcome to Wedge :))
Title: Re: Plugin servers / getting plugins to a system
Post by: DarkLite on October 17th, 2011, 12:51 AM
Quote from Arantor on October 17th, 2011, 12:30 AM
That makes sense, probably more than splitting them up, though the download process is likely to be more powerful than before, so I was also debating putting the whole 'package server' management into its own page, because it does need a shade more in future.
The upload system wouldn't take up much screen real estate: would there really not be room for a "choose file" button at the top of the page?
Quote
There is actually one page that does have so few things on it - the caching page. For most of the options therein, you don't get any configuration other than to pick caching type.
In the case of the caching page, it's got options. A page consisting of a single button offers no choice, and I don't think most people would see a reason for the page to exist (particularly when there are other closely related pages it could be merged into).
Quote
(Also, welcome to Wedge :))
Thanks!  :D
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 12:55 AM
Quote
The upload system wouldn't take up much screen real estate: would there really not be room for a "choose file" button at the top of the page?
Take a look at how big the current download page is. It doesn't fit on the screen as it is, and it's going to get more complex in future, especially with the browse stuff being intentionally more prominent.

Plus, I do want to put some notes on the page about people uploading things manually (since there's little real difference, but some people are funny like that, and it's possible that the normal code won't be able to run for whatever reason, however unlikely)
Quote
In the case of the caching page, it's got options. A page consisting of a single button offers no choice, and I don't think most people would see a reason for the page to exist (particularly when there are other closely related pages it could be merged into).
A page consisting of a single button might not be too much use, but there must be at least two buttons (a pick-files and a submit button). I did also consider making it possible to upload multiple plugins at once should that be so desired.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 17th, 2011, 07:51 AM
Am I needed here?
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 09:07 AM
Well, I'm looking for general feedback on all of the above points, really, in terms of usability, practicality, that sort of thing.
Title: Re: Plugin servers / getting plugins to a system
Post by: godboko71 on October 17th, 2011, 05:01 PM
*Upload from Computer* could be a jQuery-type pop-in, with a separate page as a fallback when JavaScript is disabled or not working correctly. Just a thought. Either way though I think it can be its own page.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 17th, 2011, 05:13 PM
Quote from Arantor on October 14th, 2011, 01:24 AM
OK, let's recap.

.zip vs .tar.gz.
There are more rugged solutions than read_zip_data() out there, one of which is ZipArchive. It's not the only solution, however, and it won't be that hard to find such a solution.

My theory of using only one file type in general just means that it simplifies the process and the code, and minimises the overall number of hassles people have, IMHO.
Sure.
OTOH, the only argument I have against this -- the plugin server would see a significant bandwidth increase if it can't serve gzipped content...
Quote
Should uploading plugins from your computer be a separate page to downloading from a plugin server?
I'd tend to put everything into the same page, but really that's because it's already the case in SMF.
Other than that... I'm not really bothered either way.
Quote
Should a plugin be able to list the plugin server or servers where it can be found?
Hmm... I don't know :^^;:
It would have to be seen, first, if anyone creates 'independent' plugin servers that host plugins by multiple users.
Other than that, it'd be just simpler for us to provide RSS feeds on the Wedge plugin server with an author filter. i.e. people could be notified when their favorite authors release new plugins or new versions. Totally missing from the SMF customization site, eh...
Quote
(Just a quick note, I took a look today at how WP handles plugin updates on the fly. Interestingly, it's practically the same as I'd outlined Wedge would have to be doing, that the download occurs, the old plugin is disabled, the new one enabled, and the old plugin then deleted. I still think the user should perform those steps, though, rather than entirely automating it. I don't know. Might have to experiment with how well that works out for us.)
We could add a 'noob-friendly' option to do everything automatically.
If it fails --> manual processing.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 05:25 PM
I'm really not that enthusiastic about putting a jQuery widget in for something like that. For uploading multiple plugins, maybe, but anything more than that is IMO unnecessary, as this is something I don't want to spend vast amounts of time on, knowing that most users won't upload multiple plugins at once, probably won't even upload plugins all that often since it should be possible to handle from downloading better...
Quote
Sure.
OTOH, the only argument I have against this -- the plugin server would see a significant bandwidth increase if it can't serve gzipped content...
Well... plugins would still be compressed, and you have to get to a largeish package before it makes a big difference. SD 2.0 for example is 553K for ZIP vs 404K for .tar.gz, and mostly that's because it has a lot of files in it, of which most won't compress all that well.

The other thing is preference. The majority of plugin authors are likely to be on Windows where making a .zip file is easier.[1]

The alternative is forcing .tar.gz across the board, which is smaller generally. Our plugin server could unpack zip and repack as tar.gz on upload. Those running custom servers would have to just deal with it, I guess.
Quote
I'd tend to put everything into the same page, but really that's because it's already the case in SMF.
Other than that... I'm not really bothered either way.
Well, if it remains as one page, it desperately needs renaming. I don't really have a problem with renaming it, just that my first instinct was to make it a separate page, but I'm increasingly agreeing with the view that it doesn't need to be.
Quote
It would have to be seen, first, if anyone creates 'independent' plugin servers that host plugins by multiple users.
Other than that, it'd be just simpler for us to provide RSS feeds on the Wedge plugin server with an author filter. i.e. people could be notified when their favorite authors release new plugins or new versions. Totally missing from the SMF customization site, eh...
Well, this is where it gets into the technicalities of providing support. I suggested doing it as a REST style but I don't think that's viable in the long run. In fact, I'm sensing it would have to pretty much be a SOAP-type request to the server to cope with everything.

What that ultimately means is that the requester sends an HTTP POST with the body being a block of XML. (Or POST vars. Doesn't really matter. The key point is that you make a POST with one or more variables attached.) Then you get a block of XML back in some form.

SOAP formalises this process, and I don't think we need go quite that far, but certainly it needs more than a simple URL request, since the process has to account for various kinds of filtering.
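A minimal sketch of what such a request could look like on the requesting side. The `<query>`/`<filter>` element names are invented for illustration and are not a defined format; the options array is the standard shape accepted by `stream_context_create()`.

```php
<?php
// Hypothetical request body: a flat list of named filters wrapped in XML.
function build_plugin_query(array $filters)
{
    $xml = '<query>';
    foreach ($filters as $name => $value)
        $xml .= '<filter name="' . htmlspecialchars($name, ENT_QUOTES) . '">'
            . htmlspecialchars($value, ENT_QUOTES) . '</filter>';
    return $xml . '</query>';
}

// Options for an HTTP POST carrying that body, with no keep-alive.
function build_post_options($body)
{
    return array('http' => array(
        'method' => 'POST',
        'header' => "Content-Type: text/xml; charset=UTF-8\r\nConnection: close\r\n",
        'content' => $body,
    ));
}
```

Sending it would be something along the lines of file_get_contents($url, false, stream_context_create($options)), with the reply being whatever block of XML the server chooses to return.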
Quote
We could add a 'noob-friendly' option to do everything automatically.
If it fails --> manual processing.
I wasn't really going to do anything else. It's not like it's that hard to automate that process, especially since the user will have likely had to provide their FTP/SFTP details at some point (could even do it ahead of time so that the details are known before even trying to download)
 1. Don't flame me and tell me that it is or isn't easier on Linux. I honestly don't care. Most people here will be using Windows on the desktop, and it's directly built into the file manager and has been for years. I don't recall the same finesse on Linux distros.
Title: Re: Plugin servers / getting plugins to a system
Post by: spoogs on October 17th, 2011, 05:46 PM
Quote
Well, if it remains as one page, it desperately needs renaming.
This is what I've been thinking all along^^ the only real issue is the name of the page. I can't speak much as to how things should be handled with the zip vs gz stuff, but as far as the layout presented to the user goes, 1 page is best. SMF's layout isn't really that bad (though it could use a face lift); it's really just the name of the page that's an issue.

IIRC live's theme Grace actually went a step further by providing the options for acquiring packages right above the list of mods which I thought was interesting looking.

As far as the name of the page... maybe View/Install Plugins or Acquire Plugins or something of the sort I suppose.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 06:24 PM
OK, so one page then. Question of the day: does anyone actually use the 'download by URL' option?
Title: Re: Plugin servers / getting plugins to a system
Post by: spoogs on October 17th, 2011, 06:27 PM
I've never used it.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 06:37 PM
More to the point, neither have I, and I think I can safely (and very legitimately) say I know SMF's package system better than most people. In fact, I'm still not entirely clear what actual use it does have. If something is serving packages through URL, you either download it to your PC and upload it, or it's going to be a package server anyway.

The *only* scenario I can envisage is for the case of a package not on a package server and you're using a phone to add a plugin. Which, frankly, is asking for trouble anyway.
Title: Re: Plugin servers / getting plugins to a system
Post by: spoogs on October 17th, 2011, 06:43 PM
I actually typed up almost the same thing but edited it to the simple 1-liner above :P

I agree it's a more natural reaction to download the pack as you're already there than to copy the URL. The phone option hadn't crossed my mind, but then again I've never even attempted to get to the ACP from my phone anyway.

If that option is going to be kept it can probably be made the last option on the page then.
Title: Re: Plugin servers / getting plugins to a system
Post by: Dr. Deejay on October 17th, 2011, 06:53 PM
I use it often when I'm too lazy to download the packages to my PC and find the corresponding package. But even if it's going to stay (I'm guessing it will be removed because most people don't use it), it sometimes just doesn't work.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 09:38 PM
But everywhere I know that offers mods also offers a package server...
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 17th, 2011, 10:03 PM
Quote from Arantor on October 17th, 2011, 05:25 PM
The other thing is preference. The majority of plugin authors are likely to be on Windows where making a .zip file is easier.
Upload zip -> Wedge server receives zip -> Wedge server repackages it as tgz -> Wedge server serves tgz when sending data across servers, and serves zip on request (manual download.)
Quote
The alternative is forcing .tar.gz across the board, which is smaller generally. Our plugin server could unpack zip and repack as tar.gz on upload.
Just as I suggested, lol... ;)
Quote
What that ultimately means is that the requester sends an HTTP POST with the body being a block of XML. (Or POST vars. Doesn't really matter. The key point is that you make a POST with one or more variables attached.) Then you get a block of XML back in some form.
Isn't that what xhr is all about...? :P
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 10:10 PM
Quote
Upload zip -> Wedge server receives zip -> Wedge server repackages it as tgz -> Wedge server serves tgz when sending data across servers, and serves zip on request (manual download.)
Yup, that way we'd be providing tgz across the board. Though I don't see a need to offer both, just do one.
Quote
Isn't that what xhr is all about
On the browser side, yes, but I was thinking about doing it on the server instead, to avoid having to cope with cross-domain requests where it won't be sending everything in the URL...
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 17th, 2011, 11:58 PM
OK, so I've thought about this a bit more, after taking some time to review fetch_web_data a little bit, since that's quite a neat little function.[1]

I can't quite make up my mind whether to use HTTP headers as part of the request (and make it a GET) or put the contents in a POST body. Either should be doable, though I guess using a GET is more semantically accurate.[2]

In other news I'm thinking of expanding fetch_web_data to support getting an arbitrary number of bytes at the start of a file and making it use the Range headers. The reason? Oversized, non attached images. When someone posts using those, the URL is fetched, the entire image is loaded into memory and its size established with the GD image info functions... but that means downloading the entire image. For the purposes of getting its size, getting the first 1.5K should be sufficient and we can do it ourselves from the file headers.
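For GIF and PNG, reading the size from the leading bytes is straightforward, since both formats put the dimensions at fixed offsets. The sketch below is not the Wedge code; the function name is made up, and it assumes the caller has already fetched the first bytes of the file (for example via a Range request).

```php
<?php
// Read width/height from the leading bytes of a GIF or PNG.
// Returns array(width, height), or null if the data is not recognised.
function size_from_header($data)
{
    // GIF: 6-byte signature, then width and height as 16-bit little-endian.
    $sig = substr($data, 0, 6);
    if (strlen($data) >= 10 && ($sig === 'GIF87a' || $sig === 'GIF89a'))
    {
        $dim = unpack('vwidth/vheight', substr($data, 6, 4));
        return array($dim['width'], $dim['height']);
    }

    // PNG: 8-byte signature, 4-byte chunk length, 'IHDR',
    // then width and height as 32-bit big-endian.
    if (strlen($data) >= 24 && substr($data, 0, 8) === "\x89PNG\r\n\x1a\n" && substr($data, 12, 4) === 'IHDR')
    {
        $dim = unpack('Nwidth/Nheight', substr($data, 16, 8));
        return array($dim['width'], $dim['height']);
    }

    return null;
}
```

Both cases need fewer than 32 bytes, so a 1.5K prefix is far more than enough for these two formats.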
 1. I mean, in an ideal world, it would be using cURL and no hosts would be putting stupid restrictions in that necessitate such measures, but in a cURL-less, stupid-host-filled world, it's pretty neat.
 2. Especially as GET methods are supposed to be idempotent.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 19th, 2011, 10:26 AM
Re: cUrl, here's a 3-year-old post by yours truly about it :P
http://www.simplemachines.org/community/index.php?topic=282969.msg1858080#msg1858080
Latest mod version: http://custom.simplemachines.org/mods/index.php?mod=1569

Use GET if you feel like it's better. You know you want it :P

Range header support in fwd would be cool. aeva_getMedia() already supports it and that's pretty cool :) I thought SMF had it in the first place... My mistake.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 10:59 AM
Well, if I add cURL support (by default) to fetch_web_data, I'll be sure to add in the facility to add arbitrary headers which means range support is a step away if anything wants it. The plugin manager will need that if I use GET, and then for the places outside media which use it, it's pretty much a one-liner to deal with.

Debating whether or not to convert fetch_web_data into a class so that setting things like URL, headers etc doesn't make for a very long function call, though it would make for 2-3 lines of calling instead of the current 1-2 (allowing for inclusion of Subs-Package as it stands right now)
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 19th, 2011, 11:25 AM
Why 3 lines..?
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 11:43 AM
The whole point of doing it is to avoid ridiculous and complex function calls that have multiple parameters that only apply in certain cases. If I'm keeping it to two lines, I may as well not bother with making it a class.

Consider, which is easier to follow... this is a hypothetical example using arbitrary GET headers.

Code:
loadSource('Subs-Package');
$code = fetch_web_data($url, '', false, 0, array('Range' => 'bytes=0-1536'));

Code:
loadSource('Class-WebGet');
$wget = new weweb($url);
$wget->setHeader('Range', 'bytes=0-1536');
$code = $wget->get();

The extra values in fetch_web_data are, in order: POST data, whether to use keep-alive, and redirection level. I'd assume the same defaults for weweb, though. This is a more extreme example; a typical example would normally only have the constructor call and the get. (If you put everything into the constructor, there's really no benefit to it whatsoever because you're just making it a bastardised function call.)
Posted: October 19th, 2011, 11:35 AM

Hmm, after a quick skim of the docs, I wouldn't be adding range support like that anyway, because cURL has proper support for ranges and expects a proper curl_setopt call for it. Still, I'd just convert that to a setRange() call, so that whether cURL or fsockopen is doing it, it can be done.
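Roughly what that could look like, as a sketch only. The class name, method names other than setHeader/setRange, and the internals are all assumptions, not the real class; the one hard fact used is that cURL's CURLOPT_RANGE option takes a plain "first-last" string while a raw request needs a "Range: bytes=first-last" header.

```php
<?php
// Hypothetical sketch only: not the actual Wedge class.
class weget_sketch
{
    protected $url;
    protected $headers = array();
    protected $range = null;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function setHeader($name, $value)
    {
        $this->headers[$name] = $value;
        return $this;
    }

    // Record the range once; each transport translates it in its own way.
    public function setRange($first, $last)
    {
        $this->range = (int) $first . '-' . (int) $last;
        return $this;
    }

    // What a cURL transport would hand to curl_setopt() as the CURLOPT_RANGE value.
    public function curlRange()
    {
        return $this->range;
    }

    // What a socket (fsockopen) transport would send as header lines.
    public function headerLines()
    {
        $headers = $this->headers;
        if ($this->range !== null)
            $headers['Range'] = 'bytes=' . $this->range;

        $lines = array();
        foreach ($headers as $name => $value)
            $lines[] = $name . ': ' . $value;
        return $lines;
    }
}
```

The calling code never needs to know which transport is in use, which is the whole argument for setRange() over a raw setHeader('Range', ...).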
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 19th, 2011, 11:45 AM
I see what you mean... Well, considering the function isn't used an awful lot, both versions are fine by me, so use whatever you want ;)
(As for the object name, 'weget' would make a bit more sense. Class-WebGet for the filename is fine.)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 11:47 AM
OK, I'll work on this one today then :) And yeah, weget is a better name.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 12:32 PM
In other news, I can only find one instance where fetch_web_data actually uses the keep-alive facility, and it's one that I think has dubious use in the future - it's used when using the existing package manager, when downloading a package, to grab the file and validate it. But I can't actually see where it's *using* keep-alive. It's not like it's going to attempt to reuse that connection between requests.

Hmm, makes me wonder whether it's needed or not - I guess it probably should be supported, in case a plugin wants it.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 12:55 PM
After getting more and more involved in it, I think it's really not worth the effort. We're not doing it on the browser where it would really make a difference. In our case, even if several requests are being sent to package servers, the time apart is still going to be several seconds, so keep-alive is bordering on the pedantic in terms of saving (all we save is setting up a TCP connection, at the cost of tying up the destination host's resources longer, and for the number of requests we'll be making, it would be better not to do so at all)


(If a plugin author ever did want to, assuming they cared enough to realise there's a very very slight performance gain to be made for multiple requests to the same server, they'd be using cURL flat out to do it, and using HTTP/1.1 with some of the other gizmos there. But I'd wonder what the hell they were doing to necessitate it anyway...)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 03:12 PM
Hmm, further to the above, I do actually need to use HTTP/1.1 for Range support. That's fine, because it's only actually an issue in fsock cases, since cURL will just deal with it transparently. Interestingly, I think there's actually a bug in fetch_web_data: it issues an HTTP/1.0 request with Host as a supplied header, which wasn't defined in HTTP/1.0.

Still, it doesn't change my modus operandi; I'm still rewriting to exclude keep-alive because we're not keeping connections alive between requests in almost every case.
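For the fsock case, the request block would end up looking something like the output of the sketch below. The function name is invented for illustration; the Host and Connection: close lines are standard HTTP/1.1.

```php
<?php
// Build the raw text of an HTTP/1.1 GET request for writing to a socket.
// Hypothetical helper; assumes $url is absolute and has a host part.
function build_get_request($url, array $extra_headers = array())
{
    $parts = parse_url($url);
    $path = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query']))
        $path .= '?' . $parts['query'];

    $lines = array(
        'GET ' . $path . ' HTTP/1.1',
        'Host: ' . $parts['host'], // mandatory in HTTP/1.1
        'Connection: close',       // one request per connection, no keep-alive
    );
    foreach ($extra_headers as $name => $value)
        $lines[] = $name . ': ' . $value;

    // A blank line terminates the header block.
    return implode("\r\n", $lines) . "\r\n\r\n";
}
```

One thing to bear in mind with the move to HTTP/1.1 is that the server is then allowed to reply with chunked transfer encoding, so the reading side has to cope with that too.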
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 19th, 2011, 03:42 PM
I'll trust you on this :P

And yeah, keep-alive is pointless if we're only downloading one file... AFAIK, the only point is to allow for more simultaneous downloads off a single server by reusing earlier connections.

Oh, and can you look into reusing weget for AeMe as well...?
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 19th, 2011, 03:49 PM
Yeah, it's pretty headache inducing, but it's been interesting to read up on the finer points of HTTP.
Quote
AFAIK, the only point is to allow for more simultaneous downloads off a single server by reusing earlier connections
Yup. Specifically it saves you setting up a new connection at the TCP level, which if you're doing a lot of work is worth saving, but for individual file requests it's not worth it - even if you do two or three in sequence (e.g. browse list of plugins, browse category within list) it's still not actually going to benefit you too much, and in almost every case it would be better to let the host have the TCP connection back instead.
Quote
Oh, and can you look into reusing weget for AeMe as well...?
Sure thing. Once it's tested, anyway ;)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 20th, 2011, 02:19 AM
After a day of swearing and wrangling I've got something I'm reasonably happy with, but I'm not quite as happy as I could be.

GIF and PNG work well enough; they only need the first dozen bytes or so. JPEG is a pain because the header is not necessarily where you expect it to be. Plans of just grabbing the first KB went out the window as soon as I started investigating JPEGs. I bumped it to 8KB pretty quickly, and I'm currently working on 16KB.

Still, downloading 16KB to determine image size isn't bad. Of the 15364 files tested with either .jpg or .jpeg extension on my PC, 15039 had their dimensions detected to be the same as what getimagesize returns. Of the 325 remaining, a number are damaged files anyway (recoveries from old HDs and so on) and the rest, almost without exception, have the relevant marker block after the 16KB boundary. Larger files (even 512x512) sometimes have the marker at 20KB into the file.[1]

So at this point I'm playing the game of diminishing returns: I can either up the boundary and accept the inevitable loss of performance, or I can cut my losses.

Hmm, let's run some stats and get a proper handle on the state of play.
* 16KB: matches 15039 out of 15364 (97.9%)
* 32KB: matches 15246 out of 15364 (99.2%)
* 64KB: matches 15284 out of 15364 (99.5%)

Of the 80 that couldn't be matched at 64K, only 3 weren't corrupted, and they're ones that are special, using CMYK colour separation, so that some browsers don't render them properly anyway.

So we're really looking at 15287 files not 15364 here, which puts the percentages at:
* 16KB: 98.4%
* 32KB: 99.7%
* 64KB: 99.9%
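
The arithmetic behind both sets of figures can be checked with a few lines of Python. The counts are the ones quoted in this post; the variable names are just for illustration, and the 77 "corrupted" files are the 80 unmatched at 64K minus the 3 CMYK ones.

```python
TOTAL = 15364
MATCHES = {"16KB": 15039, "32KB": 15246, "64KB": 15284}
UNMATCHED_AT_64K = 80
UNCORRUPTED_UNMATCHED = 3   # the CMYK files that are valid but not detected

def match_rates(total):
    """Percentage of files matched at each buffer size, out of `total`."""
    return {size: 100.0 * count / total for size, count in MATCHES.items()}

corrupted = UNMATCHED_AT_64K - UNCORRUPTED_UNMATCHED
raw = match_rates(TOTAL)
adjusted = match_rates(TOTAL - corrupted)

for size in MATCHES:
    print(f"{size}: {raw[size]:.2f}% of all files, "
          f"{adjusted[size]:.2f}% excluding corrupted files")
```

Excluding the corrupted files, the 64KB figure works out to roughly 99.98%, i.e. only the 3 CMYK files are missed.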

I think I'm going to call it a day here and commit things with the 16K version; we can always bump it up to 32K (or provide instructions) if need be.
 1. Just consider this for a moment. We're talking JPG. It's not layered, not transparent, doesn't have multiple 'images' inside, and yet we have 20KB of stuff before we get to set out the *size* of the image. Funnily enough I can tell you exactly what tools cause that, and one company in particular... :whistle: I'll leave you to guess which company.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 20th, 2011, 08:02 AM
I read into your code and you deleted the original SMF code (imagesize and stuff), maybe you should restore it and use it as a fallback for when $size isn't set at the end of the function...?
Okay, that means two hits to the remote server but blah. And an overhead of 16KB compared to the older SMF function, but that's only for <1% of all JPG files so that's a very, VERY fair trade...

(Or we could re-try with a 128KB buffer. But then it won't take other files into account -- e.g. slightly broken PNG and GIF. Not that imagesize can magically get their sizes either.)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 20th, 2011, 09:16 AM
Quote
maybe you should restore it and use it as a fallback for when $size isn't set at the end of the function...?
I'd rather not, to be honest. The likelihood is that if it's failed getting it from 16K (or 32K), it's probably a big file, which means the original code is going to have to retrieve the entire file which on any host with low settings is going to cause a white screen.

I'd rather go to a 32K buffer than have to hit the server a second time (though the code had a habit of making two requests, so even if we did a 16K hit generally and a further 64K hit if it's JPEG and we didn't find it in the first 16K, that would probably not be a killer).
Quote
(Or we could re-try with a 128KB buffer. But then it won't take other files into account -- e.g. slightly broken PNG and GIF. Not that imagesize can magically get their sizes either.)
If it's broken, it's broken, no matter how big the buffer is. There's only any point doing a retry if you can theoretically get some mileage out of it; I wouldn't have it retry on images that show up as PNG or GIF, or things that don't flag up as JPG.
Posted: October 20th, 2011, 08:50 AM

That said, the current code can be refined. Right now it assumes a range will be provided, of up to 16K - but support of ranges is not required (though strongly recommended) in HTTP/1.1 servers, and if the server ignores the range and ends up throwing a large JPG at us, the code is going to step through the whole file looking at the marker boundaries. Consequently, if the response is bigger than the requested range, we can legitimately assume it's the whole file, rather than a ranged subset, and apply getimagesize on it.
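
As a sketch of that check (Python here, purely illustrative; `build_prefix_request` and `got_whole_file` are names invented for this example): ask for the first 16K with a Range header, then decide from the status code and body length whether the server honoured it. A 206 response is a ranged subset; a 200, or a body longer than what was asked for, is treated as the complete file.

```python
import urllib.request

PREFIX_BYTES = 16 * 1024

def build_prefix_request(url, nbytes=PREFIX_BYTES):
    """Build a GET request asking only for the first nbytes of the file."""
    # Range offsets are inclusive, so the first 16K is bytes 0-16383.
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{nbytes - 1}"})

def got_whole_file(status, body_len, nbytes=PREFIX_BYTES):
    """True when the response should be treated as the entire file."""
    # 200 means the server ignored the Range header; a body larger than
    # the requested range cannot be a ranged subset either.
    return status == 200 or body_len > nbytes
```

When `got_whole_file` is true there is no point retrying with a bigger buffer, since everything the server has is already in hand.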
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 20th, 2011, 09:52 PM
Doesn't imagesize work with a handle or something? I don't remember exactly.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 20th, 2011, 10:08 PM
I know there have been out of memory errors from its use, so I can only presume it doesn't.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 20th, 2011, 10:57 PM
Hmm I just checked, it needs a filename... So that would require saving the data to the local hard drive, and then calling getimagesize on it. Seems a bit unrealistic to me... :-/
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 20th, 2011, 11:01 PM
Hmm, yeah, which is impractical and a PITA (and what was there only works if allow_url_fopen is set)

In which case, the only solution is ugly and rather costly in memory terms (but I can put a test to try and prevent it dying due to being out of memory), which is to call imagecreatefromstring - since we have the string handy - and build an image out of it for imagesx and imagesy. That seems familiar, SMF may even have done it but with all the code and staring at hex dumps of files in the middle I may well have forgotten about it :/

I didn't even realise until yesterday that fetch_web_data not only supports FTP but also arbitrary HTTP POST content. (Of course Class-WebGet *also* supports those things, which I suspect is not so well realised in the first case)


Getting back to the original topic, I realised a problem that I don't know how to deal with: authentication. I originally figured I'd send the username and password to the server as part of the request, but I realised that if the destination wasn't open to guests, it'd never reach its destination at all. So either I have to put in the requirement that a destination forum/plugin server be open to guests (since I'm looking at making it a plugin itself), or have it do an authentication and login as the first request, send back the cookie (and have that stored somewhere temporarily), then resend the actual request for updates complete with cookie.

Hmmm. It seems so much easier to make the plugin server be a separate file that can authenticate on its own without having to create a user session etc. Maybe I should do that (though that has its own interesting side effects, and I'd rather make it an action)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 21st, 2011, 12:48 AM
Specifically on the last part, I'd appreciate feedback from people who might build paid repositories of plugins: would having 'allow guests to browse the forum' be a huge problem? I'd personally assume not, because of having an area outside the paywall, for pre-order questions and any freebies you offer too.

(This assumes that there will be external sites, pretty much needed for paid resources because there is no way I want wedge.org to have to cope with the legal liability issues that arise out of hosting the plugins like a store)

Reason I ask is that I can safely get round the above issues and avoid having to create a separate file etc, and just have it as a regular plugin through the actions facility.


(I am also interested in providing a similar framework for themes, and will likely base a lot of it off the core from the plugin manager but that's another day entirely.)
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 21st, 2011, 03:10 PM
I'm guessing that the above limitation isn't really an issue... after all, as discussed elsewhere today, the best promotion of paid work is free work, and I really can't see having totally locked off repos in any meaningful fashion.


Going back to earlier in this discussion, I've been trying to figure out how tar.gz support should theoretically work if it's going to be provided for. I see two major issues that need resolution.

Firstly, detection of the (probable) validity of the package. With .zip packages, there's a central directory of what's in it, which is accessible without unpacking the file itself, so we can test for plugin-info.xml quickly and relatively easily. (Even if it's in a subfolder, as /plugin/plugin-info.xml instead of /plugin-info.xml)
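
A minimal Python sketch of that zip check, assuming the package is already in memory as bytes. The function name `find_plugin_info` is hypothetical; the standard `zipfile` module reads the central directory when the archive is opened, so listing names does not decompress any file data.

```python
import io
import zipfile

def find_plugin_info(zip_bytes, name="plugin-info.xml"):
    """Return the archive path of plugin-info.xml, or None if absent.

    Accepts the file at the root or exactly one folder deep.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for entry in zf.namelist():      # names come from the central directory
            parts = entry.split("/")
            if parts[-1] == name and len(parts) <= 2:
                return entry
    return None
```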

No such luck with tar.gz; it's not one file format, it's two, which means we first have to physically unpack it from gz format into a tar file, and even then we can't just arbitrarily dive into a tar. It is almost certainly nested (SMF is aware of this fact and has specific recursive code to cope with it) and the way it's done makes it harder to figure out up front whether the plugin is valid or not.
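
For comparison, a rough Python equivalent for tar.gz (the function name `find_plugin_info_tgz` is made up for this example). There is no index to consult, so the gzip stream has to be decompressed and the tar headers walked one after another until the file turns up or the archive ends.

```python
import io
import tarfile

def find_plugin_info_tgz(tgz_bytes, name="plugin-info.xml"):
    """Return the member name of plugin-info.xml in a .tar.gz, or None.

    Accepts the file at the root or exactly one folder deep.
    """
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tf:
        for member in tf:                # sequential walk through every header
            # Ignore empty components and a leading "./" prefix.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if member.isfile() and parts and parts[-1] == name and len(parts) <= 2:
                return member.name
    return None
```

In the worst case (file missing, or stored last) this reads the entire archive, which is the cost the zip format avoids.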


Secondly, unpacking. Because of the way zip files can be accessed, it's possible to do what we need: create a new folder and unpack things into that. Untarring is a lot more complex and will almost certainly require not only unpacking, but unpacking and then shunting files around after because of the inherent way it handles nested folders.
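
One way to avoid shunting files around afterwards is to work out the final paths before extracting anything. The Python sketch below is only an illustration of that idea (the name `flatten_targets` is invented here): if every file sits under one shared top-level folder, that folder is stripped from the target path. It does not guard against hostile names such as `../`, which a real unpacker would have to reject.

```python
import posixpath

def flatten_targets(names):
    """Map archive member names to target paths inside the plugin folder.

    If all files share a single top-level folder, that folder is dropped.
    Directory entries (names ending in "/") are skipped.
    """
    files = {n: posixpath.normpath(n) for n in names if not n.endswith("/")}
    tops = {path.split("/", 1)[0] for path in files.values()}
    nested = len(tops) == 1 and all("/" in path for path in files.values())
    return {
        n: (path.split("/", 1)[1] if nested else path)
        for n, path in files.items()
    }
```

With names taken from either archive type, `plugin/plugin-info.xml` and `plugin/src/a.php` would land at `plugin-info.xml` and `src/a.php`, while an archive that is already flat is left alone.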


I'm hoping SMF's tar.gz unpacker might be salvageable but you never know. There's going to have to be some rewriting going on to support the remote file addressing stuff... maybe it's doable, we'll see. It's likely going to be less efficient than SMF's was, given the changes in how things are done but it should be more reliable in the long run.
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 21st, 2011, 05:02 PM
After spending the last two hours trying to figure out the mechanics of how this would have to work, I have reached only one conclusion.

If people want to use the upload/download facilities, safe mode has to be off, and I think I'm just going to block out the entire area if safe mode is on, because there are so many ways it can screw up operations.

Note that turning safe mode off still doesn't solve chmod issues, but safe mode causes a great number of other issues on top of those.
Title: Re: Plugin servers / getting plugins to a system
Post by: Nao on October 21st, 2011, 09:05 PM
I'm all for *less* issues!

I'd rather tell someone to 'use a good host' than 'this is normal behavior, you can fix with this or that'...
Title: Re: Plugin servers / getting plugins to a system
Post by: Arantor on October 21st, 2011, 09:14 PM
Yeah, that's why I'm going to be disabling access with safe mode. If you really want to use plugins in safe mode, unpack them to your PC and upload via FTP.

chmod on the other hand is still the greatest PITA it has ever been.