This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.
31
Off-topic / Re: htmlspecialchars while inserting into DB
« on June 21st, 2011, 05:00 PM »
I think it's important that some element of that continues to be done, but how far do you go with it? At what point is it acceptable to trade off performance for security?
Ultimately, for that, I think several things:
1. String parsing in PHP is slow.
Actually, it's slow in a lot of languages (but especially PHP), because they use a NUL at the end of the string, and parsing is slicing-heavy. I think for larger forums, the only good solution is to write (parts of?) the parse_bbc algorithm in a tighter language, such as D. This could be done with basic IPC, much like how memcached works.
2. Caching posts needs a bit of a better solution
A lot of people want to just "cache everything." Throw physical boxes at the problem, use all of them as RAM sticks, and just cache, cache, cache, cache. This is expensive, and becomes hard to manage, expunge, etc.
What I've done is an "indexed cache", which means I mark entries in the cache that are popular, and over time, cache those popular items. Then when they get unpopular, they are garbage collected. We use this at work, and it requires very little memory, but significantly benefits viral traffic (which I'm pretty sure is the same sort of traffic forums get.)
3. The parse_bbc() routine is far from perfect
The way TOX-G parses is much better. The BBC parsing routine was written in a similar way to how one would write it in a language like C. I naively hoped PHP would turn this into optimal code (and it was the easiest way to write anyway), but it doesn't.
4. Someone needs to profile parse_bbc()
I think Groundup was trying to do this, but I don't know how much he did with it. It needs to be broken up into smaller functions - ideally five or more of them - so that the performance problem can be identified. I have some ideas where performance may be bad, but guessing isn't the best road to improvement.
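To make the "indexed cache" from point 2 a bit more concrete, here is a rough Python sketch. It is not the code described above; the class name `IndexedCache`, the hit threshold, and the idle timeout are all made up for illustration. The only idea it tries to show is that hits are counted in a cheap index, a value is stored once its key proves popular, and it is dropped again once nobody asks for it.

```python
import time


class IndexedCache:
    """Cache only keys that have proven popular; evict them when idle.

    Hypothetical sketch: names, threshold and timeout are assumptions.
    """

    def __init__(self, hit_threshold=3, idle_seconds=300.0, clock=time.monotonic):
        self.hit_threshold = hit_threshold
        self.idle_seconds = idle_seconds
        self.clock = clock
        self.hits = {}    # key -> (hit count, time of last hit): the cheap "index"
        self.values = {}  # key -> cached value, kept only for popular keys

    def get(self, key, compute):
        """Return the value for key, caching it once it is requested often enough."""
        now = self.clock()
        count, _ = self.hits.get(key, (0, now))
        self.hits[key] = (count + 1, now)
        if key in self.values:
            return self.values[key]
        value = compute(key)
        if count + 1 >= self.hit_threshold:
            self.values[key] = value
        return value

    def collect_garbage(self):
        """Forget every key that has not been requested recently."""
        now = self.clock()
        stale = [k for k, (_, last) in self.hits.items()
                 if now - last > self.idle_seconds]
        for key in stale:
            self.hits.pop(key, None)
            self.values.pop(key, None)
        return len(stale)
```

Something like `cache.get(post_id, parse_post)` would then go on every page view, with `collect_garbage()` run from a scheduled task; `parse_post` here stands in for whatever expensive call is being cached.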
On the flip side, there are things like topic subjects, which aren't resanitised on display; what's in the DB is assumed to be safe.
It reattributes old posts to a new user, e.g. new account after deletion, but it doesn't do topic ownership at the same time.
You know something's wrong when they're accepting mods on the mod site that are bug-fix mods. I only wish I were kidding.
You can say that about a lot of things, though. It's almost the same argument against something like Microsoft Word: just because you might only use 2% of its features, that sounds like an argument for stripping out the rest - until you realise that everyone else uses a different 2% of its features.
-[Unknown]
32
Off-topic / Re: Post-XSS scenarios and database driven sessions
« on June 21st, 2011, 04:36 PM »
I've really been looking into it, even though I was 16 hours overdue for sleep. So the best defence against XSS is to prevent it; I've realised that, regardless of the other measures, the session will be stolen. So it is mostly in the hands of htmlspecialchars. Is there any other way to do XSS even if the output is htmlspecialchar'ed?
As far as CSRF goes, I guess that's what those session tokens in SMF are for?
Also, as far as iframes go, a person can't really steal a cookie from an iframe, correct?
For CSRF, actually, using XMLHttpRequest and setting custom headers can really help here. But yes, using SSL and using some sort of token in the URL works well.
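For what it's worth, a session-bound token check could look roughly like this in Python. This is not how SMF does it; the function names and the way the token is derived (an HMAC of the action name under a per-session secret) are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def new_session_secret():
    """Random per-session secret, kept server-side only."""
    return secrets.token_bytes(32)


def csrf_token(session_secret, action):
    """Derive a token tied to one session and one action (hypothetical scheme)."""
    return hmac.new(session_secret, action.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid(session_secret, action, submitted):
    """Compare in constant time so response timing does not leak the token."""
    expected = csrf_token(session_secret, action)
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

The token would be sent in a form field, URL parameter, or custom request header, and the request rejected whenever `is_valid` returns False.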
For iframes, there's the X-Frame-Options header. You can't steal a cookie, but clickjacking is a significant security concern.
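<IfModule mod_headers.c>
# Stop browsers from guessing (sniffing) content types.
Header set X-Content-Type-Options nosniff
# Only allow this site to frame its own pages (mitigates clickjacking).
Header set X-Frame-Options SAMEORIGIN
</IfModule>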
-[Unknown]
33
Off-topic / Re: htmlspecialchars while inserting into DB
« on June 21st, 2011, 10:21 AM »
I wonder if preparsecode's changing everything into br tags is a holdover from YaBB where everything was in flat files. (I don't know how it was stored, but it seems feasible to me.)
That's one of the things we can do something about. We've expressly set ourselves on the path of having an importer rather than just 'upgrading' the existing tables - aside from the fact it lets people run them both side by side to experiment, it also means we can do manipulations along the way like fix any of these things we decide to resolve.
Depends what you mean by sanitise. htmlspecialchars at both ends strikes me as a bad idea, for example.
Well, the methodology I was referring to was that all mods using the database should use $smcFunc's functions, and should be using the proper parameterisation, or the proper insert method made available that deals with escaping etc for you.
were still deferred through some "well, it might cause regressions" mantra.
At the same time, I agree. Especially with the release of a new major version number, you gotta get it right or you'll pay for those mistakes for a while. I know Chrome and Firefox are going the way of loosey-goosey, but I'm not convinced that will work for either of them long term, especially Firefox.
Back whenever ago, I posted the same basic thing on simplemachines.org; if you need more RCs, release more RCs. There's no shame in releasing RCs. I'm still considering TOX-G alpha, after all (and probably will until I write better docs, some more useful samples, and add a couple nagging features that I want.) It's being used in production by a select few, and I'm ready to help those implementations upgrade if necessary if I have to make breaking changes.
Sad to hear that SMF didn't do the same.
We use jQuery in Wedge. Not just because it means we get to minimise the code to be sent to users (since the admin can pick a CDN copy of jQuery), but the time we spent writing JS is shorter, and I suspect less time is also spent in debugging. Plus a lot of users do add stuff that makes use of jQuery, so having it in the core means plugins won't try each adding it and falling over each other.
[rant]
I don't hate jQuery, but I don't really like it either. It's not a bad choice and very sane for most web apps, though. My problem with it is that it's got a lot of features, most of which I don't need, and doesn't provide most of the features I do. It has animations, but doesn't do colors; it deals with nodelists spectacularly, but makes me learn a second DOM syntax to do so; it has whiz-bang solutions like $.each, but it doesn't have convenience funcs to build this-bound delegates; it makes it super easy to build html, but in a way that encourages XSS-vulnerable code; and it supports xmlns selectors but doesn't make them compat in IE browsers. Off the top of my head.
[/rant]
But yeah, it's popular and people are learning it, so it makes sense.
-[Unknown]
34
Off-topic / Re: Post-XSS scenarios and database driven sessions
« on June 21st, 2011, 05:14 AM »
- IP checking (Can't find a way around it)
- Small cookie time length
Small cookie timeouts are a good idea, as long as you use a session keepalive (or are fine getting logged out constantly.)
The best way to do "remember me" is like another form of sessions (just not garbage collected.) This allows you to have a button to log out other computers (each of which has a separate token), etc. SMF tries to do this, and does okay, but the better way is to have each computer use a separate token.
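A minimal Python sketch of the one-token-per-computer idea, under some loud assumptions: the class and method names are invented, storage is an in-memory dict rather than a database table, and only a hash of each token is kept so that a leaked table can't be replayed directly.

```python
import hashlib
import secrets


class RememberMeStore:
    """Per-device login tokens (hypothetical sketch, in-memory only)."""

    def __init__(self):
        self._tokens = {}  # sha256(token) -> (user_id, device_label)

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id, device_label):
        """Create a token for one device; only its hash is stored."""
        token = secrets.token_urlsafe(32)
        self._tokens[self._digest(token)] = (user_id, device_label)
        return token

    def lookup(self, token):
        """Return the user id for a valid token, or None."""
        entry = self._tokens.get(self._digest(token))
        return entry[0] if entry else None

    def revoke_others(self, user_id, keep_token):
        """'Log out other computers': drop every token but the current one."""
        keep = self._digest(keep_token)
        doomed = [d for d, (uid, _) in self._tokens.items()
                  if uid == user_id and d != keep]
        for d in doomed:
            del self._tokens[d]
        return len(doomed)
```

Because each computer holds its own token, revoking one (or all but one) doesn't disturb the others.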
-[Unknown]
35
Off-topic / Re: htmlspecialchars while inserting into DB
« on June 21st, 2011, 05:03 AM »
Does this make sense? I've been wondering, wouldn't mysql_real_escape_string be enough while inserting into the DB?
From a security standpoint, the assumption should be that everything else has been compromised. From an optimization standpoint, the assumption should be everything else is perfect. Quality lives somewhere between the two.
Generally, I will cast to int things even from the database - because I don't know if my database query had a SQL injection, or maybe something went wrong in my insertion, or another software was compromised and gained access into my database. The less that an attacker can "gain" even after they successfully exploit a small hole, the better.
Plus, htmlspecialchars'ing before you insert in the database increases space requirements, and makes integration with other systems harder.
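As an illustration of "trust nothing, even from the database", here is a small Python sketch; the same idea in PHP would use an (int) cast and htmlspecialchars on output. The helper names and the row layout (`id_topic`, `subject`) are assumptions made for the example, not anyone's actual schema.

```python
import html


def as_int(value, default=0):
    """Coerce a value read back from the database to int.

    Anything that does not parse cleanly becomes `default`.
    """
    try:
        return int(value)
    except (TypeError, ValueError, OverflowError):
        return default


def render_post_link(row):
    """Build a link from a DB row, trusting nothing in it (hypothetical row layout)."""
    topic_id = as_int(row.get("id_topic"))
    # Stored raw, escaped only at output time.
    subject = html.escape(str(row.get("subject", "")), quote=True)
    return '<a href="/topic/%d">%s</a>' % (topic_id, subject)
```

Even if a bad value did get into the table, the worst outcome here is a link to topic 0 with a harmless-looking subject.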
I never used mysql_real_escape_string; I always used prepared statements because I find them a lot easier.
I always used htmlspecialchars for data that goes in an HTML attribute; I rarely use it for anything else.
Also, the data that comes out of the database tends to be dirty: it likes to keep the backslashes, so clean it with stripslashes(), and always before htmlspecialchars().
SMF (and Wedge)'s specific brand of content encoding going into the DB, where bbcode is concerned at least, is slightly odd, and one day I'll figure out exactly why it was done the way it was (remove newlines, expressly inject br tags into the stored content, after htmlspecialchars has been run)
@CJ Jackson: I'd rather not be trying to sanitise on output; I'd rather sanitise it when capturing it, so that if something screwball tries to dump the contents of the DB, it's still going to be safe because there isn't anything dirty in the DB.
At least with SMF, mods are vetted and generally have had oddities weeded out
Of people who take it, I think <10% pick a or b. 90% pick route c. It baffles me.
That said, when people do detection or don't check ini settings properly, "just fixing it" can make integration harder.
-[Unknown]
36
Features / Re: Optimize release images
« on June 20th, 2011, 09:39 AM »
The avatars folder, yes, they weren't optimised much when I put the xkcd pack together originally, but the rest of the images should be optimised.
svn ls -R | xargs wc -c
170768 total
find . -name "*.png" -print0 | xargs -0 optipng -o7
find . -name "*.png" -print0 | xargs -0 -n1 pngout
svn ls -R | xargs wc -c
165021 total
cd avatars/xkcd
svn ls -R | xargs wc -c
304040 total
find . -name "*.png" -print0 | xargs -0 optipng -o7
find . -name "*.png" -print0 | xargs -0 -n1 pngout
svn ls -R | xargs wc -c
265055 total
cd media/icons
svn ls -R | xargs wc -c
59003 total
find . -name "*.png" -print0 | xargs -0 optipng -o7
find . -name "*.png" -print0 | xargs -0 -n1 pngout
svn ls -R | xargs wc -c
53070 total
That is, of course, counting all images, not just pngs. Not an immense savings, but a savings.
-[Unknown]
37
Features / Re: Optimize release images
« on June 20th, 2011, 09:11 AM »
I believe I've already run everything through pngquant. I do it systematically for new files and keep a 32-bit copy in the other/images folder as well. ^_^
I don't automate it because png8 sometimes has awful results in IE6, btw.
-[Unknown]
38
Features / Optimize release images
« on June 20th, 2011, 03:35 AM »
Running optipng on most images can really improve things. At a cursory glance, it appears that many pngs, avatars, etc. can all be optimized with some amount of savings.
I suggest adding it to a release script, so it's never forgotten, something like this:
find Themes avatars media -name "*.png" -print0 | xargs -0 --no-run-if-empty other/tools/optipng -o4
Or even better, run it more often (e.g. when checking in png changes) and then it only has to process the changed files, which is much quicker.
For example, the avatars directory could currently save 40 KB if it were losslessly compressed (about 12%). This affects both distribution bandwidth and, obviously, admins' bandwidth. Losslessly optimizing jpegs is a good idea too.
-[Unknown]
39
Features: Theming / Re: WeCSS: the Wedge CSS parser
« on June 20th, 2011, 02:14 AM »
I think you may be dreaming on the documentation front, but who knows, luck happens.
Well, I'm currently more of a fan of lesscss anyway (in part also because of its js-side implementation in addition to js and php server-side implementations.) I know compass is uber popular, but it doesn't seem interesting to me. I can see making the curlies optional, like in PHP with if, but in general I just like curlies.
I definitely want to give it a try. Are you developing it as just part of Wedge or as a discrete module? For any parsing system, like TOX-G, less, sassy, or this wecss, I definitely think test-driven development makes sense - do you have tests, or are you just using the Wedge css as that for now?
-[Unknown]
40
Features: Theming / Re: CSS and JavaScript minification
« on June 20th, 2011, 01:47 AM »
Yes, it's based off Packer 3.1. Dean Edwards has been working on Packer 4.0, though, and he told me he'd be putting the PHP version online soon.
Regarding semicolons, I provided a fix in the source file, but I commented it out because I think it's best to educate devs into using semicolons as much as possible.
I definitely wouldn't want it to just pack my css or js every page view. Not only for the cost of the mtime IO checks, but also because it makes debugging harder by far. I don't know if you do js debugging, but I definitely do, and debugging packed js is similar to debugging an optimized exe with no debug info.
That's why I'd suggest a dev mode switch. Also, because it means you can have the dev mode only apply to administrators (with a conspicuous message, like when you leave upgrade.php uploaded), you can "stage" js and tox changes, test them, and then "push them live" for everyone with a click of a button.
Also, are css/js files served with the overhead of PHP? I can show benchmarks that even on nginx/php-fpm, this overhead is not nothing. I suggest providing some way to avoid it if possible, as Apache and nginx both have deflate mechanisms, and their support is detectable.
Oh, and I automate everything I can. Take a look at my TOX-G makefile - I don't leave it to chance that I forget the copyright year, or to run the tests, or etc. And I have packing, optipng, etc. all automated as well. If I can remove the human component, I do.
-[Unknown]
41
FAQs / [FAQ] Re: Minimum requirements
« on June 19th, 2011, 08:33 PM »
Would that still make it a requirement? If an optional feature doesn't work, it seems like a recommended thing not a minimum requirement.
When I buy a game and don't see fancy shadows, it's not because my video card doesn't meet the minimum requirements... it's because it doesn't meet the recommended ones.
-[Unknown]
42
Features: Theming / Re: Template blocks
« on June 19th, 2011, 04:33 PM »
Or defined on the settings.XML override file yes.
It is simplistic but it's a good compromise.
Of course ideally I'd be using tox, but I'm not sure themers wouldn't be lost. There are so many changes in Wedge already.
Posted: June 19th, 2011, 03:38 PM
Oh. And I believe there are several private topics regarding the birth of this feature and comparisons with tox.
-[Unknown]
43
Off-topic / Re: A PHP fork?
« on June 19th, 2011, 11:35 AM »
I find it very interesting. Honestly, I wish I had the time to sink my teeth into something like this. These are some good improvements, and if things like this are getting rejected, I wonder how much good a fork could do in the world...
Quote from Arantor on June 16th, 2011, 01:26 PM
Yes, although annoying, I agree.
Quote from Arantor on June 17th, 2011, 05:12 PM
This is called jsonp and has problems with > 4k of data. Some of the "easy" things jQuery does don't necessarily encourage best practice (I know this from having to do code review at work.)
Quote from Eros on June 18th, 2011, 02:37 AM
Forcing users to use GET and POST, rather than an ambiguous source is a nice step, though honestly I'd love to see a proper taint detection method such as in Perl, where you explicitly can't do anything to input without some kind of sanity check first.
I don't know which off the top of my head, but if it works how I think it works, it'll be GET - because what it can do is inject a <script> tag into the DOM for the browser to fetch the contents dynamically - and it'll be JSON when it comes in, presumably.
....I wouldn't call the POS IDEs for PHP proper IDE's either. Then again, the only thing I think Microsoft ever did right was Visual Studio so....:/
Indeed. I use Phalanger myself for PHP and it works great. I bet it could be hacked relatively easily into supporting a JSON-like array syntax.
-[Unknown]
44
FAQs / [FAQ] Re: Minimum requirements
« on June 19th, 2011, 11:04 AM »
I wouldn't waste my time with the Chrome 1 requirement. You would be surprised at how hard it is to lock Chrome to a single version. I don't think it's worth worrying about older than 6 or 8 or something at this point, and even probably not that far back.
Why might Flash be required?
Quote from Nao/Gilles on March 28th, 2011, 09:43 AM
I am constantly astounded by how many nifty features (like deferred) I never have any use for, and how many much simpler core things that I've used in my own js libraries it completely misses (like currying.)
Quote from Nao/Gilles on March 28th, 2011, 09:43 AM
Ha. Well, at least they definitely know what they're doing. It's becoming a standard though, such that I wouldn't be surprised if it goes the way C did: Intel optimizes CPUs for C, so if you don't use C, your code is slower. This is why all languages are based on C's rules nowadays. I expect Jaeger and V8 and etc. to get optimized for jQuery, so it's probably going to become a standard. If it keeps getting more popular, I suspect it'll be bundled and detected at some point by browsers for further perf wins (since we're currently in a perf war.)
Quote from Nao/Gilles on March 28th, 2011, 07:17 PM
You can get some good wins out of deferring some of the js. I'm going more the js app route myself, so a mere 20K doesn't sound like a lot (although it all adds up.) I'm currently more in the realm of 270KB (before deflate or minification, about 41KB after) but I late-load about half that. This is for a relatively complicated piece of internal software, though.
-[Unknown]
Why might Flash be required?
We included jQuery but mainly for user interest. i.e. we don't actually need it ourselves, but we figured that since it's the de-facto library for developers, many would be happy to see it included by default, avoiding the need to include the library separately (potentially breaking other mods using it in the first place.) Everything is taken care of for them. However, early versions of $ had a reasonable filesize. Now, v1.4.4 is about 24kb after gzipping, which is scandalously big. It pretty much makes it hard for 56k modem users to run a site that uses jQuery. And the newest 1.5.x branch is even worse (I suspect v1.5.3 will finally reach the 30kb limit.)
I guess I still haven't found my peace with jQuery. It has some nice features, but nothing that we couldn't implement ourselves. Hopefully, v2.0 will be modular... But I suspect that even if they added some kind of modularity to it, it would still be bigger than the entirety of 1.4.x.
Re: filesize, it's also less of an issue when files are loaded at the end. That's where perceived loading times come into play. The only thing that's slower in that situation is the execution of JS functions. But that only means your code should behave as if the browser doesn't support JavaScript for a couple of seconds, and then JS takes over. It's as easy as that. (Of course it's not really exciting to write fallbacks for non-JS but you can always simply skip that and just expect people NOT to click everywhere in the first two seconds of loading your website for the first time... Which is, let's just say it, totally unrealistic. The act of clicking so quickly, I mean.)
-[Unknown]
45
Features: Theming / Re: WeCSS: the Wedge CSS parser
« on June 19th, 2011, 10:42 AM »
I belong to the school of semantics, by far, although I think using inheritance is only sensible.
I know I'm often the first person to write my own version of something, and I love learning from it and seeing if I can do better. And here as well, I think writing a PHP version makes a lot of sense. But, considering Sassy recently moved to using CSS-style syntax, and I personally hate Python-style syntax, perhaps it makes sense to gravitate to the rules of another syntax?
That could make the documentation needs lighter.
-[Unknown]