The 7th, 8th and 9th of October saw the fourth PHPNW conference in Manchester, and I was lucky enough to be part of the team helping out on the day. I spent the day before the conference driving speakers from the airport to the Mercure (formerly Ramada) hotel in Manchester Piccadilly, and only partially getting lost in the convoluted mess of one way streets in the city centre.
There was a pre-conference social the Friday night before in Kro bar (in Piccadilly Gardens), which was a good opportunity to catch up with people who come to the conference every year, but also meet lots of new people. Some of the people there attended the full day of tutorials on the Friday, and as I wasn't able to attend those I was interested to see how they went (very well as it happens). I also played some mean games of Mario Kart on the Wii that the user group owns :)
On Saturday morning I got up nice and early and helped to get everything set up before people started arriving. My main task was to go about and put signs all over the place so that people could find their way around the venue before helping out with registrations. We had about 400 people to get registered and inside the conference in about 45 minutes, but I don't think there were many problems. As part of my responsibilities I was to 'shadow' two of the speakers, making sure they were where they were meant to be and giving them cues about how much time they had left. After everyone was registered (with only a few hiccups along the way) we started the day.
Keynote: Standing On The Shoulders Of Giants
Ian Barber (@ianbarber)
The keynote this year was an inspirational journey through the entrepreneurs of our industry (starting with Alan Turing) and how they each took the ideas before them and improved them. Everything from open source software, platforms, the hardware that runs them or the academics writing papers about graph theory 20 years ago has been built upon by successive generations of engineers. It was a nice first talk of the day, and wasn't too heavy for first thing on a Saturday morning.
Photo by Stuart Herbert
All of us at the conference were there to improve, whether individually or through improvements to the technology we use in our professions. It is quite easy to take a look at Reddit or the latest GitHub projects and guess what is going to be the next big thing, but we can even make our own future by rolling our own or joining an existing project.
Facebook is a good example of something that came from many parts, built by others, and Ian used this as a first example to emphasise his point. Facebook originally stole the idea of friends and profiles from a site called Friendster, which couldn't compete in the global market as it couldn't scale. The chat system in Facebook was built upon the technology behind AIM, which was a popular chat system in the 1990s. The main success of Facebook came from its ability to scale slowly. It was originally only available at Harvard University, and with each new university Facebook could gauge how much more capacity it would need and build its systems slowly.
Programming languages can also be seen to improve upon the work of others. In 1966 Martin Richards at Cambridge University created a language called BCPL, which was adapted by Ken Thompson into B, which was then built upon by Dennis Ritchie into C. The C language is what Rasmus Lerdorf originally used to create his first version of PHP (then called PHP/FI). This was then built upon by Andi Gutmans and Zeev Suraski to create PHP3 and the Zend Engine, which is still in use to this day.
Improvements are not linear, and a good example of this is the quicksort algorithm. This was originally created by Tony Hoare in 1960 and was in use in multiple systems for nearly 50 years until a guy on a mailing list called Vladimir Yaroslavskiy proposed a new version called a dual pivot quicksort. This new quicksort algorithm was more efficient than the old one and was quickly adopted by major software firms. It is now part of the Java language.
It is also important, when building upon the work of others, to question conventions, which Google is especially good at. Back in the 1990s the main search engine was AltaVista, and it was largely considered to be already at the edge of what was possible to do with a search engine. Larry Page and Sergey Brin proved this wrong when they created the PageRank algorithm, formed Google, built upon the work that AltaVista did and made it better.
The Internet comes from people that published what they did so that others could build upon their experiences. Companies like Amazon and Google wrote papers and produced software that were used by Facebook when they wanted to build a big architecture for their systems.
Ian also said (with which I heartily agree) that community is important, and cited Lorna Mitchell as an example of this. Lorna went from a talk at a barcamp about SVN to an expert in open source version control, and this was accomplished not by sitting at a desk, but by talking to and working with other members of the (mainly PHP) community.
Failures are an important learning tool, but if we all play our part in building tools that others will use and build upon then the industry will flourish.
Zend Framework 2: State Of The Art
Enrico Zimuel (@ezimuel)
I have seen the video of Rob Allen talking about what's new in Zend Framework 2 last year, so I was interested in seeing what Enrico would say, and when the new version would be available. Although he fastidiously avoided announcing any release dates he was still able to engage my interest in some of the new features available.
Zend Framework 2 will be an evolution of Zend Framework 1. There are a couple of dev snapshots available, with a full beta version expected this month. The new version has an architecture based upon dependency injection, events and interfaces, as well as the MVC model. The events will form an event driven architecture, which will be a major component of Zend Framework 2.
A number of core methodologies are being used in development, including the decoupling of some of the components of the framework, the introduction of an event driven model (through Zend\EventManager) and a set of standard interfaces. The standard interfaces (Zend\Stdlib) will provide a better interface between components. Version 2 of the framework will also take advantage of the features of PHP 5.3+ with things like closures. The motto of the framework is 'Consistency and performance'.
The autoloading parts of the framework have been given a complete overhaul, with the complete removal of require_once() calls from within the code. There are now three ways of loading classes.
- The include_path autoloader classes from Zend Framework 1 are still there, and will work in the same way. The hope here is that any application using the original framework can be easily ported to the new framework without rewriting lots of code, which is important.
- Using namespace and/or prefix autoloading with the registration of namespaces.
- Class-map autoloading is where you create a big array that tells the autoloader what each class is called and where it is. They have also created a tool called classmap.php which will generate this list of class names and locations. This tool will eventually be incorporated into ZendTool.
Although the class-map autoloading idea seemed a little crazy to me, Enrico spent a few minutes talking about the performance of each method, which was a clever move as I'm sure I wasn't the only one with that question in mind. Class-map autoloading showed a 25% improvement over the old Zend Framework 1 autoloader method, jumping to 60-80% when an opcode cache (like APC) is in place. Pairing namespaces/prefixes with specific paths showed more than a 10% improvement, rising to 40% with an opcode cache in place.
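To make the class-map idea a bit more concrete, here is a minimal sketch of how such an autoloader could work using plain spl_autoload_register(). This is not the Zend Framework 2 autoloader or the output of its classmap tool; the Demo\Greeter class, the temporary directory and the $classMap variable are all made up for illustration.

```php
<?php
// Hypothetical demo class, written to a temp directory so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/classmap_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . '/Greeter.php',
    '<?php namespace Demo; class Greeter { public function greet() { return "hello"; } }'
);

// The class map: fully qualified class name => absolute file path.
// In a real project a generator script would build this array for you.
$classMap = array(
    'Demo\Greeter' => $dir . '/Greeter.php',
);

// One array lookup per class: no include_path scanning and no
// string manipulation to turn a class name into a file name.
spl_autoload_register(function ($class) use ($classMap) {
    if (isset($classMap[$class])) {
        require $classMap[$class];
    }
});

$greeter = new Demo\Greeter(); // first use of the class triggers the autoloader
echo $greeter->greet(), "\n";

// Tidy up the temporary files.
unlink($dir . '/Greeter.php');
rmdir($dir);
```

The point of the design is that the expensive work (finding out where each class lives) is done once, ahead of time, which fits with the performance figures Enrico quoted.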
All of these autoloading interfaces can be moved into Zend Framework 1 applications by simply moving the needed files into the correct place in the Zend folder. The only requirement is that PHP 5.3 be installed.
The new dependency injection systems within Zend Framework allow for some clever inclusion of different classes. This is done by code, configuration or by annotation.
There was previously a problem when trying to either introduce logging or debug points into framework code, or when allowing users to add caching without needing to extend framework code. Out of these problems the Event Manager was born. This is a class that aggregates listeners, each of which has one or more events that are triggered when certain system calls are made. From what Enrico said it looked like the Event Manager is going to be a pretty major system component.
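As a rough illustration of the listener idea, here is a minimal sketch of a class that aggregates listeners and triggers them by event name. It is not the Zend\EventManager API; the SimpleEventManager class, its attach() and trigger() methods and the 'user.save' event name are all my own assumptions.

```php
<?php
// Hypothetical minimal event manager; NOT the Zend Framework 2 API.
class SimpleEventManager
{
    private $listeners = array();

    // Register a callback for a named event.
    public function attach($event, $callback)
    {
        $this->listeners[$event][] = $callback;
    }

    // Call every listener attached to the event and collect the results.
    public function trigger($event, array $params = array())
    {
        $results = array();
        if (isset($this->listeners[$event])) {
            foreach ($this->listeners[$event] as $callback) {
                $results[] = call_user_func($callback, $params);
            }
        }
        return $results;
    }
}

$events = new SimpleEventManager();
$log = array();

// Logging is bolted on from outside, without extending the code that triggers the event.
$events->attach('user.save', function ($params) use (&$log) {
    $log[] = 'saving ' . $params['name'];
});

$events->trigger('user.save', array('name' => 'alice'));
echo implode("\n", $log), "\n";
```

With something like this in place, logging, debugging or caching can be attached as listeners rather than by subclassing framework code.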
The MVC model in Zend Framework 2 will have a flexible, event driven architecture where controllers are dispatchable objects. It will also embrace the module architecture more so that the application is split into a series of different modules outside of the 'public' directory.
The packaging system being implemented involves creating a package manager (quite like Pyrus) as the primary distribution mechanism. This means that both individual components as well as bundles can be packaged and distributed. The package management tool will be able to handle dependencies, as well as files like zip, tar.gz and even phar packages. It is still under heavy development so not many details were given.
The team behind Zend Framework 2 are trying to make the migration of existing applications from ZF1 to ZF2 as painless as possible. The main idea here has been to allow the migration of code without having to rewrite much of it. There is apparently a Zend Framework 2 migration prototype available, but I didn't write down the link from the slide and I can't see the slides anywhere online for the talk. If I do come across it I will update this post. One of the main features is that there is no more assignment from the controller to the view; any variables can just be returned from the controller action.
Enrico admitted that the new version is taking a while, and put this down to the fact that the process was so open. There is lots of interaction with the community, and it is the community that matters; it is an open source project after all. If you want to get involved in creating Zend Framework 2 then you can visit Matthew Weier O'Phinney's blog post or visit the official website at framework.zend.com.
Enrico was hesitant to announce a release date for ZF2, but he did say that a beta version should be expected soon (perhaps even November), and that everyone at the talk was welcome to test it and contribute to it.
PHP Testers Toolbox
Sebastian Bergmann (@s_bergmann)
This talk consisted of a tour through the different testing tools and frameworks available at the moment. As Sebastian is the creator of PHPUnit, it was interesting to see how he found using and configuring other testing frameworks. It was clear that he had tried out each of the tools on offer as he was able to comment on how easy they were to set up and how they accomplished things like mocking.
There are lots of different test frameworks available, and as Sebastian kept getting asked what the difference is between PHPUnit and other frameworks he thought he would give a synopsis of each. He did say that he quite often saw flamewars erupting over different frameworks, which he felt was unnecessary and counterproductive.
First, he took us through a quick run-through of what testing was, and what sorts of testing methodologies were available. The two main types of testing are dynamic and static testing. Dynamic testing involves executing the software to make sure it does what it is meant to do, but also how fast and scalable it is. Static testing involves testing the software by simply inspecting the source code.
Black box and white box testing are terms used to describe how much visibility is available to the tester. Black box testing means that the tester has no access or prior knowledge of the software they are testing. This involves things like testing to a specification, exploratory testing and security testing. White box testing is when the tester has full access to the source code and includes API testing to exercise code, code coverage and other software metrics and static testing.
Sebastian also mentioned something called mutation testing. This is a form of white box testing, which involves intentionally changing the source code to see how robust the tests are. The idea is that if your tests pass, then changing a single value in the code and rerunning them should cause at least some of them to fail. If nothing fails then the tests aren't really exercising that code, which makes weak points visible.
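Here is a tiny hand-rolled sketch of the mutation testing idea, just to show the principle. Real mutation testing tools generate the mutants automatically; the $isAdult function, the mutant and the little test suite below are all invented for this example.

```php
<?php
// Original code under test (hypothetical example).
$isAdult = function ($age) {
    return $age >= 18;
};

// A mutant: the same code with one operator changed (>= became >).
$mutant = function ($age) {
    return $age > 18;
};

// A tiny test suite: returns true when every check passes.
$suite = function ($fn) {
    return $fn(30) === true
        && $fn(10) === false
        && $fn(18) === true; // boundary check; without it the mutant would survive
};

$originalPasses = $suite($isAdult);
$mutantKilled = !$suite($mutant);

echo 'original passes: ', var_export($originalPasses, true), "\n";
echo 'mutant killed: ', var_export($mutantKilled, true), "\n";
```

If the boundary check were removed, the suite would pass for both versions, which is exactly the sort of gap that mutation testing is meant to reveal.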
Tests also involve a certain amount of scope. Unit tests are meant to test discrete packets of functionality within the system or just a class. Integration testing involves testing the application as a whole or with third party systems. System testing involves testing the application as a whole, from the front end.
The objectives of testing are either acceptance testing i.e. "the code I am working on now is working" or regression testing i.e. "there is a bug, so I'll write a test that tests the expected output so I know when I have fixed it".
Before moving onto the testing frameworks available, Sebastian quickly said that anything he says should be taken with a grain of salt, or at least a spoonful of salt. I thought this was a good caveat as I'm sure he would be biased in some respects as he has spent the past 10 years working on PHPUnit.
PHPT is the simplest testing framework and is used for testing PHP itself. I have actually written these tests when testing PHP at community events and remember using run-tests.php to get everything working. It is very limited in what it can do, and doesn't produce very good output, or valuable code coverage reports. PHPUnit can run tests in PHPT format using 'phpunit test.phpt', which was added by Sebastian as he didn't want to use run-tests.php to test PHP.
SimpleTest is a framework that uses a class to wrap the tests and runs the test by invoking PHP. This is in contrast to PHPUnit, which has its own command line tool. I have used SimpleTest in the Drupal world and so am quite familiar with it.
Sebastian said there had recently been a few new testing frameworks, of which a project called Atoum stood out as new and interesting. Most of the documentation on Atoum is in French, and it is therefore not all that easy to figure out what is going on. The framework forces you to use namespaces or it will crash quite heavily. The use of assertions is a little bit difficult, but it is down to personal taste really. One thing of note is that (with Xdebug installed) code coverage is calculated along the way and reported on the command line, which isn't all that useful really.
Mocking objects is possible in all of the frameworks on offer. PHPUnit, SimpleTest and Atoum all do it out of the box, and PHPUnit can also interface with Mockery and Phake to create mocks in different ways. Apparently it is possible to use all three PHPUnit methods at the same time, although it would be very confusing. Atoum is a bit different (and therefore more powerful) in the way that it mocks objects as it creates a new mock object for every test run.
One plugin that can be used with PHPUnit is vfsStream. This plugin creates a virtual file system that lives entirely in memory, which allows tests that touch the file system to run quickly.
Specification testing is where you run your unit tests and check them off against a set of specifications. PHPUnit does this out of the box using the flag 'phpunit --testdox', although this is quite basic.
There are a few new frameworks available that go a bit further with specification testing.
Behat is a behaviour testing framework in which you set up features with scenarios. These are all written in plain text, but there is also the need to write context code so that Behat can figure out what is going on. Sebastian did mention that the context code can get so complex that it is necessary to test it using a unit testing framework.
PHPSpec is another behavioural testing framework, but doesn't use natural language. A few code examples were shown to give us an idea about what the framework did.
For a couple of years PHPUnit had an extension called PHPUnit_Story, which implemented a scenario (rather like a test) and you would write some contexts to help the story along. However, this extension hasn't been maintained recently, and due to recent new changes in PHPUnit it is now broken.
System level testing involves testing the application from the 'front end' and has become mainstream in recent years, and this is because there are a number of tools to accomplish this.
A combination of Behat, Mink and one or more of Goutte, Sahi and Zombie.js can system test an application. Because Behat uses natural language the writing of the tests can be left to a non-developer. You still need to trust the non-developers to come up with a set of scenarios that describe how the system works. There must be an agreed common vocabulary for the application, decided upon before any tests are written.
Selenium IDE is a system that allows you to system test an application via a number of browser plugins. Sebastian did say that there was an extension for PHPUnit called PHPUnit_Selenium, but that this wasn't to be used as it was being dropped from the next release. The functionality was written one evening when bored in Norway and can't run the tests in parallel or using multiple browsers.
Profiling is also an important part of testing an application, especially one that needs to scale to a large number of users. You will want to make sure that your application returns pages quickly. Xdebug is a common tool in the world of unit testing as it allows many unit testing frameworks to create code coverage reports. Using Xdebug and a tool like KCacheGrind can give you a nice indication of which parts of the application are causing bottlenecks.
There is a tiny Python script called gprof2dot.py that takes the profiling information (from the Xdebug profiler) and will create a better set of graphs than KCacheGrind.
XHProf is a newer profiler extension, which has a very small footprint and can therefore be used on live environments. XHProf can apparently be used with PHPUnit in a combined unit testing and performance analysis approach. Load testing can be achieved via the use of tools like Pylot and Grinder, which is a good way of seeing what will fall over when the system is under strain.
I haven't seen Sebastian talk before (except briefly at the European PHPUnconference in January) and although he has a relaxed and friendly talking style it is clear that he knows exactly what he is talking about. I thought his review of other testing frameworks was fair.
Estimation Or 'How To Dig Your Own Grave'
Rowan Merewood (@rowan_m)
Rowan started about 15 minutes late due to technical difficulties, and even had to swap computers and enlist the keynote speaker to advance the slides to get his presentation working. Once up and running, however, he gave a simply exceptional talk that was full of jokes and had lots of advice clearly gained from real world experiences. It was the most entertaining and valuable talk I saw all weekend.
Photo by Stuart Herbert
He first took us through some of the mistakes that he has seen (and made) when making estimates.
Sales create estimates - It is important to use the skills on your team correctly, sales have their own function within that unit but they shouldn't be expected (or allowed) to create estimates. It is usually a good idea to send along a developer with the sales people when pitching to clients, just to keep things sane.
With one man bands who do the sales and the development on their own it is important to watch what hat they are wearing.
Rowan also made a good point about optimism at the start of the project. It is easy to underestimate things that you might have enthusiasm for. It's just important to remember that in 6 months time that same level of optimism might not be there.
Lone developer estimates - Every time a developer does a set of estimations it is essential that either another dev does the same estimates, or that the estimates are reviewed. This challenges opinions and helps keep estimates more accurate.
Estimate from detailed task lists - It is easy to get a big list of detailed requirements and just go through them saying how long each will take. The only thing is that this list will change (change is the only constant) so the estimate you produce will need to be constantly tweaked. These lists also give a misplaced sense of confidence and encourage micro-management from project managers.
Estimate a day as 8 hours - No one person or team is a code machine so during an 8 hour working day the chances are that you are not going to get 8 hours of useful usable code out the door. Therefore, it is impossible to say that a 16 hour task will fit into exactly 2 days. There are lots of factors like office admin, answering emails, dealing with issues on other projects, helping other developers or even just plain old research that influence how much work you get done. In Rowan's opinion a developer will get about 6 hours of coding done, whereas a lead developer will get about 4 hours of coding done in a day.
With regards to estimates surrounding skill sets of the developers it is a good idea to add something like a golfing handicap, or perhaps a multiplication factor based on skills.
Estimate in hours - If you write your estimates to the hour you are still providing too much detail, no-one can (or should) provide that level of detail when estimating. Rowan suggested that we should be using 1/4 day blocks to estimate tasks, incrementing to 1/2 day, 1 day, 2 days and 3 days blocks. If you have a task that is going to take longer than 3 days then it should be split into smaller tasks and the same approach applied to it. What you should ask is something along the lines of "if I got this task in the morning, could I have it done by lunch", which should be roughly 1/4 of a day. For bigger projects it is probably better to use larger block units like 1, 3, and 4 days.
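To put some numbers on the last two points, here is a small sketch that converts raw coding hours into days using Rowan's figure of about 6 productive hours a day, and then snaps the result to the 1/4, 1/2, 1, 2 and 3 day blocks. The estimateInBlocks() function name is mine, and always rounding up to the next block is my own assumption rather than something stated in the talk.

```php
<?php
// Hypothetical helper: snap a raw estimate to the next block size from the talk.
// Returns null when the task is bigger than 3 days and should be split up.
function estimateInBlocks($codingHours, $productiveHoursPerDay = 6)
{
    $days = $codingHours / $productiveHoursPerDay;
    foreach (array(0.25, 0.5, 1, 2, 3) as $block) {
        if ($days <= $block) {
            return $block; // assumption: always round up to the next block
        }
    }
    return null;
}

var_dump(estimateInBlocks(16));    // developer: 16 / 6 = 2.67 days of work
var_dump(estimateInBlocks(16, 4)); // lead developer: 16 / 4 = 4 days of work
```

So the "16 hour task" that looks like exactly 2 days on paper comes out as a 3 day block for a developer, and for a lead developer it is already big enough to need splitting into smaller tasks.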
Just estimate coding - This is something that I have definitely been guilty of. All teams have overheads in day to day tasks, as well as the inevitable writing of documentation that must occur during the project. Rowan said that his team blocks about 5% of the project time to documentation, although he did admit that this was mostly doing PHPDocs.
You should also think about your dependencies during your estimates. If you know that you have to interface with a third party system then add some time to it in order to account for things going wrong or having to do research into the system.
Estimates as commitments - An estimate should never define your schedule, instead, try to quantify the risk associated with a task. This is a difficult thing to do in itself, but try to think how difficult something might be to implement and how much extra time might be needed if things are harder than it first seemed.
Waterfall estimates are useless - A waterfall estimate is where one task descends into the next and so on, and is an important part of software engineering, regardless of the actual project methodology used. Rowan introduced me to a concept called the 'Cone of uncertainty', which apparently isn't a D&D spell. Essentially, estimates are at their most uncertain at the start of a project, and only become more accurate as work progresses and more is known.
Agile means we can't estimate - An agile software approach is often used as an excuse to drop software engineering and just start hacking the code together.
The next part of the talk was an introduction into making the client understand what the approach to the project is. Estimates are essentially a budget of time involved on the project. Three ways to allow the client to understand what is going on are the Holy Triangle, MoSCoW and the Kano Model. I had heard of a couple of these before, but I think they are interesting enough to warrant discussion about them in another blog post.
If you really want to use agile project management principles on a client then you will need to earn their trust first before getting agile. It is probably best to create some form of roadmap and then move to a sprint based approach.
Another estimation trick that Rowan told us about was called 'Planning Poker' which is a method where the entire team builds the estimate. This therefore means that everyone is involved in making the commitment. This approach combines independent estimates and a review into one project estimate.
The final part of the talk was about mistakes that people make during the projects, once all of the estimates are done. These are as follows:
Lose track of time - If there is one thing that developers hate (and I can attest to this) it is time tracking. It is a good idea to automate time tracking if you can, or at the very least introduce it into your stand-up/scrum meetings so that you can update your burn down charts.
Estimate bugs - The problem here is that you really don't know the size of the problem before you begin. It can be that you spend 3 days to track down one bug and change a single line of code (thus making for the most expensive line of code ever written) but these situations are realistic and you can't say that it would have taken you that long beforehand.
Try to catch up - Things can often go wrong during project timescales, and when things do go wrong you should just gender up (as Rowan called it) and admit that you are wrong. If you try to work all hours to get things done then you will burn either yourself or your team out. If you work long hours of overtime then you set a precedent, after which your project managers will simply expect it.
Skip the review - A review at the end of a project is an important stage in the project lifecycle. It allows the team to learn from their mistakes so that others can avoid them. If no time is spent reviewing the project once complete then there is no flow of communication between members of the team.
If someone is leaving the company then have someone else shadow them for at least a week before they go. This is especially important if a new person has been hired to replace the person who is leaving. It is too late to do this afterwards.
At the end of the talk Rowan was asked what time he would give to unit tests in any estimate, to which he said that he allocates about 40% of the coding time to writing tests.
Are your tests really helping?
Laura Beth Denker (@elbinkin)
Laura is part of the development team behind Etsy Inc. and clearly had real world experience about testing a number of large systems. She had an enthusiastic presentation style which conveyed information well, but her presentation lacked structure. She seemed to jump back and forward through the subject matter and I found myself often wondering what context she was talking about.
Having the correct motivation for writing unit tests is important in writing decent tests. Without the correct motivation then the tests tend not to mean anything. Bad motivation would be:
- "My boss told me to"
- I'm a completist, 100% code coverage or bust!
- I just love to use test frameworks!
The correct motivation would be:
- To verify correctness of the code (i.e. that the expected matches the actual).
- To gain confidence in the application.
- Communication about how parts of the application should work or that it fills expectations required of it.
Laura said that there are three types of test that can be run on a system. These are functional tests, integration tests and unit tests.
Functional tests - Functional tests use systems like Behat (or Cucumber), PHPSpec or even just a keyboard and mouse to test the full application. Essentially, you will want to answer the question "does the overall product satisfy the requirements". These sorts of tests work well whilst prototyping the application as they focus more on the product requirements. You should be able to take your prototype tests and apply them to the built application and see them pass.
Integration tests - Integration tests look at how different parts of the application fit together. This is especially important when accessing third party systems as they will generally be a weak point in your application. The problem here though is that third party systems will often be the weak point in the tests as well and can often cause your tests to randomly fail. Don't write tests for third party services, only test your application at the interface.
Unit tests - Unit tests are for testing that the logic is correct in a particular function.
When writing unit tests it is important to write tests in a standard way, following certain rules, so that developers don't have to spend time digging through test files to find the right stuff. The first part of this should be a coherent naming convention. Pick a directory that the tests will sit in, which shouldn't be part of the application code, and give each file a suffix like Test.php. Matching the name of the test file to the file being tested is a good way of tracking down what code is being tested.
When creating test classes there should only be one class per file, the name of which should match the file it is testing. The test directory should contain no interfaces or abstract classes; they should be within your application. Classes should contain no private methods. It is very important not to put any control structures into test code; as soon as any logic is included, the test itself needs to be tested to make sure it is functioning correctly.
You can get a few software metrics from different tools to analyze your testing code to make sure things are no more complex than they need to be. These are metrics like Cyclomatic Complexity (which should be no more than 1), Nesting Level (maximum of 0) and Unnecessary Override (maximum of 0). Laura has written a PHPUnit standard for PHP_CodeSniffer, which looks like it detects these things.
Other test types to think about before the application goes into production might be performance or load testing and security testing. It is also a good idea to employ some kind of monitoring and logging in the application so that problems can be detected when they arise.
Photo by Stuart Herbert
The last session before dinner (and beer) concluded with lots of thanking and some prizes being given out. I was amazed when my card was picked out of a bag and I won a book (on HTML5). The fact that I had spent much of the day sat next to the bag encouraging people to enter the draw was only a coincidence :)
Overall the first day of the conference went extremely well. I normally spend much of PHPNW sat in side tracks, but because I was helping out on the main track I found myself attending the main track more this year and therefore sitting in talks that I wouldn't necessarily have sat in. I don't mean this in a negative sense as all of the speakers I saw gave great talks and I learnt a hell of a lot. The free bar tab certainly helped dull my headache a little from a day of heavy concentration. It was good to relax and chat with some of the speakers and other attendees about everything from current computer science exam papers to the funding model of the h2g2 site.
Photo by Rob Allen
The food this year can best be described as adequate and certainly wasn't as good as last year's. The lunch menu was great, but when I approached the counter for the dinner menu all that was on offer was dry rice with either an oily hotpot or a nasty looking vegetable curry. I went for the curry initially, but after a "Really?!" from the guy behind the counter I changed my mind and went for the hotpot instead. It isn't good when even the staff complain about the food! That said, the hotpot was at least edible.
I think my biggest regret of day 1 was that I was so busy doing other things all day (mainly volunteer tasks) that I didn't get to go to any of the Unconference talks. I heard from other people there that they were all excellent talks, and I even got mentioned due to a recent blog post I wrote on #! code about LiveDocx (which was cool).
Day 2 started with me doing a little room managing, which wasn't too difficult as there weren't many people about first thing. There was a limited amount of space on the Sunday, which meant that not everyone could attend. The speakers were still on top form and the talks didn't disappoint.
Feeling Secure? Notes From The Field
Paul Lemon (@anthonylime)
Paul only had 30 minutes to go through what is a massive subject, so he said up front that he would only be able to cover the top two items in the OWASP security list: SQL injection and cross-site scripting (or XSS). OWASP is the Open Web Application Security Project, which compiles a list of the things that programmers should be doing to protect their applications from security holes. They also publish a list of the top security vulnerabilities found on the internet, and Paul said he was surprised that SQL injection attacks were at the top of the list. It is clear from this that we aren't doing enough to educate developers to prevent it happening.
A good quote from the OWASP testing guide is "the problem of insecure software is perhaps the most important technical challenge of our time", which sums up things nicely.
SQL injection is where a variable is passed directly from the input to the database query, which can then be used to inject extra SQL into the query. Attackers, once they find an injection point, will spend time either looking at error messages, or simply looking at page load times, to try and figure out what sort of things they can do. There is also no benefit in obscuring table names, because an attacker who can get into the system can find out what the table names are anyway.
There are many mechanisms to prevent such attacks:
- Something as simple as validation can prevent your database from being compromised. If you are expecting an integer, then validate it to reject anything that isn't.
- Use PDO or an ORM system so that you can use parameterised queries.
- Set up your database permissions so that even if an attack vector is found your database user is so restricted that very little can be done with it.
- Code reviews are important to spot mistakes.
- Don't be complacent! It only takes one query to be wrong for the whole system to be compromised.
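To make the parameterised query point a bit more concrete, here is a minimal sketch of the difference. It is written in Python with the built-in sqlite3 module purely so that it is self-contained (this isn't code from Paul's talk); in PHP the same idea is what PDO's prepare() and execute() give you. The `users` table and the function names are made up for illustration.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database with a made-up users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is spliced straight into the SQL text,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterised: the value is passed separately from the SQL,
    # so it is only ever treated as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # matches every row
    print(len(find_user_safe(conn, payload)))    # matches nothing
```

With the unsafe version the payload turns the WHERE clause into something that is always true, whereas the parameterised version just looks for a user with a very odd name.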
Don't just validate for the user's benefit; ensure the correct type is being passed. You should also have a whitelist of allowed input values and a sensible minimum and maximum.
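As a rough illustration of what type checks, whitelists and ranges might look like together, here is a small Python sketch. The parameter names, the allowed columns and the limits are all assumptions of mine for the example, not something from the talk.

```python
# Hypothetical whitelist of columns a user is allowed to sort by.
ALLOWED_SORT_COLUMNS = {"name", "created", "price"}


def parse_page_number(raw, minimum=1, maximum=500):
    # Type check: only plain ASCII digits are accepted, so signs,
    # spaces and SQL fragments are all rejected before conversion.
    if not isinstance(raw, str) or not raw.isascii() or not raw.isdigit():
        raise ValueError("page must be a whole number")
    value = int(raw)
    # Range check: a sensible minimum and maximum.
    if not minimum <= value <= maximum:
        raise ValueError("page is out of range")
    return value


def parse_sort_column(raw):
    # Whitelist check: anything not explicitly allowed is rejected.
    if raw not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unknown sort column")
    return raw


if __name__ == "__main__":
    print(parse_page_number("12"))
    print(parse_sort_column("price"))
```

The point is that the check rejects everything that isn't explicitly expected, rather than trying to spot things that look dangerous.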
The essential message here (again) is not to trust anything from the browser. This includes any posted form data, the query string, the current URL, any cookies that have been set, and even any HTTP headers, as they can be spoofed as well. If you are using any third party services then you should also treat their responses as user input and validate them in the same way. If the database has been compromised then all the validation will be useless (or the cause of the problem), so you should ensure that you escape all output as well. Even something as simple as stripping out script elements from the output can prevent this sort of thing.
When escaping output, htmlentities() should be enough. If you need to go beyond this then you can use tools like HTML Purifier, although this will come with a performance hit.
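To show what escaping on output does in practice, here is a minimal Python sketch using the standard library's html.escape(). It is only a stand-in for the PHP function: it escapes the HTML special characters (closer to PHP's htmlspecialchars()) rather than every character that has an entity. The `render_comment` function and the markup are made up for the example.

```python
import html


def render_comment(author, body):
    # Escape every untrusted value at the point of output, including
    # quotes, so it can never be interpreted as markup.
    safe_author = html.escape(author, quote=True)
    safe_body = html.escape(body, quote=True)
    return f'<p class="comment"><strong>{safe_author}</strong>: {safe_body}</p>'


if __name__ == "__main__":
    print(render_comment("mallory", "<script>alert('xss')</script>"))
```

The script tag ends up displayed as text in the page instead of being run by the browser.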
Other things you can do to prevent session exploits are to roll a new session ID when the user logs in and to only transmit the session cookies over HTTPS. This means that if the browser requests the favicon for a site, which is usually done over HTTP, then the session cookie is not accidentally sent along with it.
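Here is a rough Python sketch of those two ideas: swapping the session ID at login and marking the cookie as HTTPS only. The `SessionStore` class and the cookie name are hypothetical and only exist for the example; in PHP the first part is what session_regenerate_id() is for.

```python
import secrets
from http.cookies import SimpleCookie


class SessionStore:
    """Hypothetical in-memory session store, for illustration only."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id):
        return self._sessions.get(session_id)

    def regenerate(self, old_id):
        # Call this at login: the data moves to a fresh ID and the old
        # ID stops working, so an ID captured earlier is worthless.
        data = self._sessions.pop(old_id, {})
        new_id = secrets.token_urlsafe(32)
        self._sessions[new_id] = data
        return new_id


def session_cookie_header(session_id):
    # Build the Set-Cookie value with the Secure flag, so the browser
    # only sends the cookie back over HTTPS.
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


if __name__ == "__main__":
    store = SessionStore()
    anonymous_id = store.create()
    logged_in_id = store.regenerate(anonymous_id)
    print(session_cookie_header(logged_in_id))
```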
The essential message from this talk was to not be complacent. You should expect attacks and should monitor your site to make sure everything is as it should be. You should also keep an eye out for security news, especially the news pertaining to your system. Using unit tests can also help in checking that the input ranges don't create attack vectors. Finally, using build and deploy mechanisms can help prevent mistakes (such as debug scripts being uploaded to the server) and allow you to recover the code when an attack happens.
Many To Many - No Man Is An Island
Jacopo Romei (@jacoporomei)
This talk from Jacopo was an inspirational talk about professional and personal development. I later found out that he gave this talk as the keynote speech at a PHP conference in Italy, and that kind of made sense as it had that sort of feel to it.
Using examples of two very different airplane crashes he showed that expertise and social skills are closely related. Genius is overrated and is often an instance of the Dunning-Kruger effect. The Dunning-Kruger effect (which I was impressed that Jacopo knew about) states that the less you know about something the more you think you know about it.
Companies are based either on reciprocity or remuneration in terms of how they pay their developers. The times for lone coders plugging away at code all day (or night) long are long gone.
The community helps you to double check your ideas, base your job on people, hire distant people and to develop your expertise. It gives you the courage to accept feedback.
Jacopo uses and teaches the principles of extreme programming and so he is used to guiding others and working closely with other developers.
Compiling PHP - Bringing Down The Walls & Breaking Through Language Barriers
Richard Backhouse (@filthyrichard)
Jadu is a software company who have built their own CMS, also called Jadu, using standard PHP technologies on Linux. Richard said that although they had enough business with Linux they kept losing out on projects that had Windows (and IIS) as a fundamental requirement. This was in the days before PHP was a stable and viable solution on IIS, so they didn't feel that installing Jadu on IIS could be done in corporate environments. Another option was to rewrite Jadu in .NET so that it could be run on IIS without any problems, but this would mean rewriting the entire application and then maintaining two code bases, which would be a nightmare.
Enter Phalanger. This is a PHP language compiler, written as a .NET component. It was started in 2002 and was originally sponsored by Microsoft. Phalanger basically works by adding PHP to the CLI (the Common Language Infrastructure that .NET is built on), which is then run by the .NET process. It can compile PHP, run PHP scripts unmodified under .NET, and run .NET code and use the .NET framework from within PHP.
When evaluating Phalanger, Jadu had a set of requirements that it must meet before they would consider using it. They should be able to run the CMS PHP under .NET, it must be compatible with the PHP extensions that Jadu required, and they should be able to run the system in .NET.
Phalanger can be installed with an .msi installer file. This adds a configuration handler to the .NET framework so that PHP scripts can be run. Jadu mostly worked out of the box, with only one or two lines of code needing to be changed in some CMS. Extensions are added via the web.config file. Phalanger provides a PHP.Core namespace for .NET that allows interaction with PHP scripts.
The implementation of PHP under Phalanger was a little broken: not all features were available and serialising arrays produced different results. Some of the functions that were missing were json_encode()/json_decode(), session_regenerate_id(), and md_string(). The options open to Jadu were to fix and update Phalanger, which they actually did on one or two occasions, or to provide the missing functions themselves so that the site didn't break.
An idea called duck typing was introduced to allow the strict typing of .NET languages to operate safely with the loose typing of PHP and to allow integration with Visual Studio. The idea was that if it looked like a duck, and quacked like a duck, then they could at least assume they had an aquatic avian of the family anatidae. Duck typing involved masking normal objects as duck types, but this meant creating hundreds of different duck types. What Jadu did was to create a tool that automatically created the duck types for all of the classes they had. An upshot of all this is that they could then use Visual Studio with proper intellisense.
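For anyone who hasn't come across duck typing before, here is a tiny Python sketch of the general idea: an object is accepted because it has the right methods, not because of what class it is. This is only an illustration of the concept and not how Phalanger or the Jadu tool actually generates its duck types; all of the class names are made up.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Duck(Protocol):
    # The "shape" an object must have to be treated as a duck.
    def quack(self) -> str: ...
    def swim(self) -> str: ...


class Mallard:
    # Never declares that it is a Duck, it just has the right methods.
    def quack(self) -> str:
        return "Quack"

    def swim(self) -> str:
        return "paddles off"


class Robot:
    def beep(self) -> str:
        return "Beep"


def describe(candidate: object) -> str:
    # isinstance() against a runtime-checkable Protocol only checks
    # that the required methods are present on the object.
    if isinstance(candidate, Duck):
        return f"duck: {candidate.quack()}"
    return "not a duck"


if __name__ == "__main__":
    print(describe(Mallard()))
    print(describe(Robot()))
```

Having a typed description of the shape like this is also what lets an editor offer sensible autocompletion for objects coming from a loosely typed language.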
Phalanger still has some problems. There is no autoloading and no namespaces, and although it has full 5.1.6 support, it doesn't have the features available in 5.3.
Right now, about 50% of new Jadu sites are .NET sites, supported through Phalanger.
There is still a lot of stuff that Jadu don't use for various reasons. These are things like using .NET libraries from within PHP, which can't be done because it requires syntax that isn't valid in standard PHP, thus breaking backwards compatibility. It was important to Jadu that only one code base was used, so adding this functionality would mean having two versions of the software.
With regards to performance, certain issues have been seen (e.g. when serialising) but mostly Phalanger is faster than standard PHP. It used to be a lot faster, as PHP on Windows/IIS suffered from poor performance, but it's still a bit faster.
In my opinion Phalanger seems to have been built to solve a problem that might have existed a few years ago, but has been largely solved by PHP 5.3 and the work on Windows integration. Since the relatively recent speed improvements of the PHP language on Windows this tool has become largely obsolete. I can see that someone might want to use .NET code from within PHP, but as this breaks PHP outside of Phalanger it doesn't seem sensible to use this when normal .NET code will do the job. However, the talk was an interesting journey through the decisions taken when looking at this tool and how Jadu became involved in its development.
The final talk done, my final task of the conference was to drop some of the speakers off at the airport, which went without a hitch. It was awesome being able to chat to PHP professionals from across Europe, and to answer some of their questions about life in Manchester. I even got to meet Michiel Rook, who is the lead developer of Phing, which I have written about quite a bit on this blog.
Photo by ecotorch
Overall, the conference was an outstanding success. The organisers always pitch the conference to a local audience, but I always see people from all over the world come along and it is great to see lots of new faces year on year as the conference gets bigger. PHPNW always attracts an awesome set of speakers and this year was no exception. All of the speakers I saw gave fascinating talks that either educated me or simply challenged the way I think about things. If you missed out on talks that you couldn't get to, or missed the conference entirely, then you should be aware that all of the talks (minus the unconference) were recorded and will be available (for free) on the PHPNW channel in the next few weeks. There are plenty of talks that I wanted to watch and didn't get the chance to, so I will be watching out for them. I will post an announcement here when they start to be posted.
On a final note, I would like to thank Jeremy Coates and his team from Magma Digital, as well as the rest of the organisers and the people who helped out alongside me, for making this year's PHPNW conference the best yet. I can't wait until next year! :)