About this page
This page describes my learning experiences with PHP through the design and implementation of a Web simple application for recording income and expenditure: SCaB, the Simple Cash Book. This page does not describe how to use the SCaB application; rather, it describes how I wrote the SCaB application and why I made the design decisions I did. It is intended for readers who have some experience with programming (though not necessarily in PHP) and a basic knowledge of Web development.
The purpose of this project was entirely pragmatic, in that I wanted to create an application to track my personal and business expenditure. Writing the application in PHP gave for a good opportunity to try my hand at Web application development in general and PHP programming in particular. I did not set out to create a general-purpose Web development framework, though many of the techniques I explored while writing SCaB could conceivably be re-used in other projects. However, the project is definitely “exploratory” in nature despite its real-world requirements. In other words, there are certain features in SCaB that I could have lived without but which I implemented anyway because they allowed me to explore a particular aspect of PHP or MySQL programming. Some of these features really only exist as proofs of concept and would need to be developed further if SCaB was intended for use by a general audience. Likewise, there are probably a number of missing features that a casual user (or a professional accountant) would regard as indispensible. For example, SCaB is very much, as its name suggests, a cash book. It has no facility (as yet) for producing financial statments like Balance Sheets.
Why SCaB?!
It’s often the case that application programs are given ridiculously self-aggrandising names (like Excel or WordPerfect), which the application often fails to live up to. Or they may be given names so vague as to be utterly unmemorable (usually involving some permutation of the words “Information”, “Management” and “System”). The “official” name of the application whose development is described in this document — Simple Cash Book — probably falls squarely into the “utterly unmemorable” category. Changing SCB to SCaB seemed like a bit of fun: deliberately picking name that was anything but ridiculously self-aggrandising. Also, when the author was growing up in Dublin, to call someone a “scab” was a grave insult: a “scab” was a miser, a person who wouldn’t share their sweets or toys or money. The kind of person, in other words, who was likely to count their every penny, much as the application was designed to do. And thus was born a name that was somewhat apt (assuming one was au fait with Dublin slang) and a good deal more memorable, if perhaps a little nauseating. The author apologises for any distress caused by his strange sense of humour.
Contents
Background
SCaB is a simple Web-based application for recording income and expenditure. It is written in PHP and uses MySQL to store its data. SCaB has been run successfully on PCs running the Linux operating system and the Apache Web server.
What is PHP?
PHP is, according to its
manual[1], a
widely-used Open Source general-purpose scripting language that is especially
suited for Web development and can be embedded into HTML
.
Although PHP has enough general purpose programming constructs to be used for generic system-administration tasks, its roots are in web development.
History of PHP
The first incarnations of what was to become PHP were written by Rasmus Lerdorf. At this point, PHP was not a programming language in its own right; rather, it was a set of Perl scripts that Lerdorf used to help him maintain his personal home page (hence the origin of the name) Later, Zeev Suraski and Andi Gutmans rewrote PHP as a standalone language. Suraski and Gutmans formed Zend Technologies, a company that produces commercial PHP products.
Features of PHP
PHP is a modern scripting language with a number of powerful features:
- Basic support for Web programming: parsing of the query string/ POST payloads; session handling
- Powerful array-handling features
- Object oriented: object-oriented constructs were added to PHP in version 3 of the language and have been improved on in each subsequent release. Version 5 added support for interfaces, public and private members and exception handling.
- Reflective and dynamic features: PHP allows programmers to examine the state of their code at runtime; for example, to get the class name of an object or to determine if it supports a particular method. Programmers can also create new functions or evaluate arbitrary expressions at runtime.
- A large list of extensions to the basic language including database connectivity for most of the popular RDBMS systems, as well as support for other technologies such as LDAP, XML processing, etc.
In the past few years, PHP has seen great success. It has come from being regarded as a rather obscure hobbyist language to a solid web development platform running some of the world’s most popular Web sites, with the support of many large companies. I believe that there a number of reasons for this:
- A C-like syntax: Although C has its detractors, by
adopting C’s syntax, PHP has allowed new PHP programmers to come from
a number of different backgrounds:
- Professional programmers used to developing in C, C++ or Java
- System administrators familiar with other C-like scripting languages such as Perl and awk
- Web designers familiar with client-side JavaScript but who might not necessarily consider themselves “programmers”
- A low barrier to entry: As Open Source software, potential users of PHP can test it out without licencing or other issues. Also, a large number of hosting services offer low-cost PHP hosting plans, another consequence of PHP’s availability under an Open Source licence. The combination of the Linux operating system, the Apache Web server, the MySQL database server and PHP is sometimes referred to as LAMP and is seen as a viable competitor to other propietary Web-development stacks.
- An “incremental” or “evolutionary” learning process: PHP does not abstract away the basic request/response nature of the HTTP protocol. This is in contrast with other Web development environments like Ruby on Rails[2] (in which Web sites are built following the traditional Model-View-Controller pattern) or ASP.NET[3] (which tries to make Web development as similar as possible to the event-driven model used in traditional desktop GUI development). PHP allows novice programmers to come up to speed at their own pace by allowing them to add interactivity piecemeal to an existing static Web site. For example, a novice PHP programmer could begin their experiments with the language by adding a footer to their home page that displays the current date. From there, they could add simple if/then logic to control the display of content on the page. Finally, as they became more confident with their abilities, they could store the content of their pages in a database. In this way, the PHP programmer’s skills evolve in a very natural way.
- Availability of documentation: many free or Open Source projects offer poor or non-existent documentation; in contrast, PHP has a good online manual[4] which is supplemented with contributions from the PHP user base, as well as a large number of PHP development sites[5].
Criticism of PHP
Although PHP has been successful, it hasn’t been without its critics.
Early versions of PHP were heavily criticised for poor security. A feature of PHP called “register globals” made PHP easier to learn for beginners but made PHP scripts vulnerable to injection attacks. Newer releases of PHP have tended to disable this feature by default.
Some criticisms are the result of design decisions made early in PHP’s history, decisions that made sense at the time but are increasingly difficult to justify: the lack of any namespacing in the extension system (which made sense when extensions were few); the assumption of an 8-bit character set (which made sense when most PHP programmers lived in the US or Western Europe); the fact that many extensions are not thread-safe (which made sense because multithreaded programming on Linux was still something of a novelty back when PHP was first written). Most of these specific issues are being addressed or will be in future versions of PHP.
PHP’s incremental philosophy, which allows new programmers to pick up the language with little effort, has had its own downsides. Many PHP programmers will have learned the language pragmatically, absorbing just enough to get the job at hand done and having little interest in building consensus on “best practices”, though these are starting to take shape with initiatives like the Zend Framework. However, I believe that there is something of an inferiority complex at work in the PHP community. Many PHP programmers are self-consciously aware that PHP is looked upon as “amateur” by the users of more so-called “professional” languages like Java and there is a strong pressure in the community for PHP to adopt features from these languages even if they don’t fit into PHP very well.
What is MySQL?
MySQL[6] is a popular Open-Source relational database management system available for Linux, Windows and other operating systems. It was written by Michael “Monty” Widenius, a Finnish programmer who works for MySQL AB, a commercial company that offers support and other pay services for the MySQL database. The current release of MySQL is version 5 but version 4 remains in widespread use. Early versions of MySQL lacked certain features found in other RDBMS systems (for example, subqueries, transactions, views, stored procedures and triggers) but these have been added in later releases; despite these shortcomings, MySQL rapidly became the database system of choice for many Open Source projects and is used by a large number of popular Web sites.
The design of SCaB
Basic design goals
These goals crystallised from hunches or gut-feelings I experienced as I explored the language, based on my own experiences with other programming languages and with other requirements. Some would be PHP-specific, while others would apply to Web application design in general. As such, they are somewhat personal and not to be regarded as being carved in stone:
- Separate markup and code: The great advantage of PHP (and other technologies that take a similar approach (e.g. ASP, JSP, Ruby on Rails) is that they allow code and HTML markup to co-exist without the excessive escaping of special characters that makes writing CGI applications in C such a pain. However, it’s still quite common to see PHP applications with large amounts of HTML markup embedded inside PHP variables. In my opinion, this defeats the purpose of using PHP in the first place; it also makes applications difficult to localise and difficult to upgrade if the design of the site changes. While writing SCaB, I have attempted to minimise the amount of embedded markup and localisable text held in hard-coded PHP or JavaScript strings, though I did not entirely eliminate all occurrences to my satisfaction.
- Fail early: I believe that a Web application should always attempt to return a valid HTML document, even in the case of errors. Returning a partially-constructed page might be confusing to a user, whether the “user” is a human being or a script. The pages in SCaB attempt to do as much validation as possible before any output is generated. This is so that they can redirect cleanly to an error page if an invalid value is supplied or some other error condition, such the database server going down, occurs.
- Use JavaScript sparingly: The recent popularity
of Web applications using AJAX
[7]
techniques to provide levels of
interactivity comparable with those of desktop applications
has re-awakened interest in JavaScript. However, many users
disable JavaScript in their browsers because of misuse by ad
companies or because of security concerns, while others use
browsers with no JavaScript capability whatsoever. For these
reasons, I believe that JavaScript should be used to provide
complementary functionality in a Web application,
rather than to provide functionality that does not work if
JavaScript is disabled.
JavaScript is used for the following in SCaB:
- Client-side validation: SCaB uses JavaScript to check that the data passes basic validation before being sent to the server; for example, by ensuring that a textbox intended for numeric input does indeed contain a number. These functions are called from a form’s submit handler and will cancel the form submission if there are any invalid fields. If JavaScript is not present or disabled, the invalid data will be caught by the server-side validation and SCaB will display an error page.
- Additional controls: if JavaScript is enabled,
certain additional controls can be added to a page when it
is loaded:
- A Select All/Select None checkbox on lists of items.
- A “Calculate” button can be displayed beside certain fields. Clicking this button will evaluate the contents of the field as a JavaScript expression. This is handy in an accounting application like SCaB as it allows the user to perform simple arithmetic without having to run an external calculator program.
- A “Calendar” button can be displayed alongside date fields. Click this button will display a popup calendar window.
- Faster deletion of records: SCaB uses a
technique called
“redirect after POST” to
prevent changes to the database being inadvertantly
duplicated. One of the disadvantages of this technique is
that it requires two HTTP requests to be made to the server
to delete a record. However, if JavaScript is enabled in
the user’s browser and the browser supports the
XmlHttpRequestobject, SCaB will delete records by sending a POST request directly to the server, thus eliminating one of the requests. This could be seen as a form of “AJAX lite”.
-
Use a declarative programming style: most
modern programming languages, PHP included, are imperative
in nature: the programmer describes (often at great length)
how the computer is to perform a particular task. In
contrast, there are programming languages that are declarative
in nature: with these languages, the computer is told what
the programmer wants and the how is left to the computer itself.
While declarative programming languages are often considered
somewhat exotic, there is a declarative language that is seen
nearly every day by application developers: SQL. With SQL, the
programmer formulates a query and it is left to the database
engine’s query analyzer to figure out what combination of
disk reads, index lookups and other jiggery-pokery is necessary
to satisfy the query.
PHP is not a declarative programming language. However, this does not prevent us from adopting a declarative style of programming where circumstances favour it.
Consider the code fragment below from SCaB’s “list transactions” page:
require_login();
require_permission('LIST_TRN');
In essence, this code states that only logged-in users — and furthermore, only those users with a particular permission — can access this page. This is far more readable than the equivalent imperative style which would probably involve an explicit
ifstatement to test if the user has logged in and if they have the correct position. Or consider the code used to read the query parameters from the same page:// Get parameters
$display = get_parameter('display', PARAM_TYPE_ENUM, array_keys($display_modes), 'weekly');
$from_date = get_parameter('from', PARAM_TYPE_DATE, timestamp_to_mysql_date(0));
$to_date = get_parameter('to', PARAM_TYPE_DATE, timestamp_to_mysql_date(time()));
$sort_field = get_parameter('sort', PARAM_TYPE_ENUM, $fields, $fields[0]);
$order = get_parameter('order', PARAM_TYPE_ENUM, $orders, $orders[0]);
$currency = get_parameter('currency', PARAM_TYPE_ENUM, array_keys($currencies), $_SESSION['default_currency']);
By writing the code in this way, we can see at a glance what type of data is expected in each parameter and and the default value to use if any parameter is omitted. This style helps to make the code self-documenting.
The declarative style is common in many modern Web development frameworks, notably Ruby on Rails.
Redirect after POST
In the original HTTP specification, it was recommended that GET requests were used to query data on the Web server and POST requests were used to modify the state of the server. Many Web applications do not follow this recommendation (which can lead to disaster: there are horror stories of Wiki-type websites destroyed by visiting search engine spiders because each page contained a “delete this page” link implemented as a GET request). Even Web sites that use GET and POST correctly can malfunction if the POST request returns a page of content. If the user moves back and forth in their browser history, the POST request can be submitted multiple times to the Web server (though most modern browsers will display a rather cryptic alert before posting the data again)
To avoid the problems with POST, some Web developers have recommended using a technique called “redirect after POST”[7]. As the name suggests, the Web application does not send any data in response to a POST request; insteads it redirects the browser to a new page. The advantage of this technique is that it doesn’t break the browser's back button or cause strange warning messages to appear.
The way SCaB implements redirect after POST is illustrated below.
Each kind of entity in the SCaB database (accounts, transactions, etc.) has a
three- or four-letter identification code (“acct” for accounts,
“trn” for transactions, etc.). By convention, the PHP script that
displays a list of a particular kind of entity will be called list_xxx.php
where xxx is the corresponding identification code. This script will
normally display a table of entities with a checkbox to allow entities to be
selected for modification and a series of action buttons to select
the kind of editing action (most entities support the creation of new entities
or modification, duplication or deletion of existing entities; some entities
allow more) is to be carried out on the selected entities. When the user
clicks the chosen action button, the browser issues a GET request for the
get_xxx.php page. This page displays the appropriate form for the
selected action, storing the URL of the calling page as a hidden field in the
form. When the user clicks the OK button to confirm their changes, the browser
issues a POST request for the put_xxx.php page. This page
updates the database appropriately and then redirects the browser back to the
stored URL.
Other design issues
Some of the ways in which SCaB developed did not come about as the results of conscious design decisions but were consequences of my own familiarity with PHP, the versions of PHP and MySQL that were current at the time and the kind of application SCaB is (that is, a small application with few complex features intended for use by a single person or a few people at the most). These are some of these kind of issues:
- No Front Controller[9]:
Some Web development frameworks (for example, the
Zend Framework[10])
make use of a pattern called a Front Controller.
A Front Controller acts as a kind of “switchboard” for a
Web application. All requests made by a browser pass through
the Front Controller which decides which piece of application
code should handle which request and hands off to that code.
Using a Front Controller can give a Web developer additional
flexibility as well as a single point to handle site-wide
concerns like access control. However, a Front Controller can
make it difficult to pin down the location of bugs and may cause
performance problems.
Because SCaB is such a simple application, it does not use a Front Controller; each resource specified by a URL in the application usually corresponds to a PHP file on the disk.
- Limited object orientation: SCaB uses few
object-oriented concepts in its design. I wrote the first versions of
SCaB in 2001, not long after PHP 4 was released. Although rudimentary
support for object oriention existed in PHP 3, PHP 4 represented a
considerable advance in this area, though the PHP 4 implementation
of objects remained somewhat idiosyncratic. Personally, I didn’t feel
there was any compelling reason to use objects, especially as
PHP’s powerful array-handling means that an array can be used in
many situations that would require the use of a class in other languages.
In addition, SCaB’s conceptual model
was very relational in feel and adding any kind of Object-Relational
Mapping would have introduced redundancy in the code. Because
the only persistent state in SCaB was in the database (the first
versions of SCaB had no login protection and didn’t use persistent
sessions), there were no long-lived objects like there would be in,
say, a J2EE application. As a result, there is only a single class in
SCaB: the
db_connectionclass used to wrap the connection to the MySQL database.
Application data model
The SCaB database can contain multiple account sets. Generally, a user will be working with a single set of accounts at any one time. Each set of accounts is further subdivided into account groups (for example, Income and Expenditure). The accounts themselves have transactions posted against them. SCaB is a double-entry accounts system so each transaction debits one account and credits another.
This is the basic account-handling functionality in SCaB. In addition, there are the following features:
- Each set of accounts can contain a number of transaction templates. For normal accounts, a great many of the transactions will debit and credit the same accounts. Account templates simplify the creation of these transactions.
- Each transaction can have one or more transaction labels associated with it. Labels are similar to the tags found in many online services.
- Each transaction can have one or more files attached to it; for example, when recording a large purchase in SCaB, a user could scan the receipt and attach the image to the transaction.
This is illustrated in the diagram below (some fields have been omitted from the tables for clarity)
Users and permissions
As an application intended for use by a single person or a few people at the most, SCaB has a fairly simple system for authenticating users and ensuring a user has permission to access a particular entity in the database. This system is fairly generic and could be adapted for use in other applications but is more of a proof of concept than a fully-fledged general-purpose security mechanism.
SCaB understands two levels of permissions: global permissions and set-level permissions. More levels could be added if needed — for example, down to the account or even transaction level — but I decided that that would be too fine-grained for such a simple application. These permissions are stored in two different tables in the database but are merged into a single array in the user’s session when they log in or change to another set.
As was mentioned in the basic design goals, SCaB uses a declarative approach to security. The following security functions are provided:
require_login(): checks the user has a valid login session and redirects the user to the login page if they don’t.require_permission(): checks the user has the selected permission and displays an error page if they don’t.require_all_permissions(): checks the user has all the listed permissions and displays an error page if they don’t.require_any_permission(): checks the user has at least one of the listed permissions and displays an error page if they don’t.has_permission(): returnsTRUEif the user has the selected permission,FALSEotherwise. This function is used to hide certain parts of a page from non-privileged users.
The .down file
A useful feature of any Web application is the ability to lock users out of it
so that maintenance can be carried out (for example, a database backup). This
can be done in a brute-force way by shutting down the Web server but this is
often not possible if the server is shared and is quite user-hostile.
SCaB provides a very simple way of shutting it down: if a file called
.down is placed in SCaB’s directory, any users attempting
to use the application will be redirected to a warning page indicating that the
application is offline. The existence of the .down file is
checked when the require_login() function is called so that
content that does not require an explicit login (for example, online help) would
remain accessible during the outage.
The .down file does not have to contain any information but a short
message can be included in it which will be displayed on the warning page that
the users see if they try to use SCaB while it is offline. The idea is that
system administrators would be able to run scripts like the following:
#!/bin/sh
# Backup the SCaB database
# Take SCaB offline
echo "The database is being backed up. Try again later. " > /var/www/scab/.down
# mysqldump ...
# Allow users back into SCaB
rm /var/www/scab/.down
Handling multiple selections
A set of accounts in SCaB can contain a large number of transactions, perhaps thousands in a single year. Because of this, it was important for SCaB to be able to edit multiple entities simultaneously. Also, as many transactions are similar (differing perhaps only in the date of posting), it was convenient to have a way of duplicating transactions.
The ability to edit multiple records at once was handled in the following way:
- The “list” page for a particular entity (accounts, transactions, etc.) displays a checkbox beside each record. This allows multiple records to be selected for editing or copying.
- The “get” page for the entity checks if multiple records are selected and if there are, it goes through each record checking which fields are the same in all records and which are different. Those fields that are different are replaced with a string indicating a multiple selection.
- The “get” page then displays the form for editing the records (selecting the “duplicate” action displays the same form). This form will contain an array of hidden elements for containing the primary keys of the selected records. Also, for each field in the entity, the form will have two input elements: a hidden element containing the old value and a visible element that the user can use to enter a new value for that field.
- When the user submits the form, the “put” page
for the entity checks if an “edit” or “duplicate”
action was requested. If so, it calls the
build_multi_querymethod of thedb_connectionclass. This method takes as arguments the name of the table, the list of primary keys, the list of fields being edited, the type of query desired and the POST values sent by the “get” page. The method examines the POST values and returns an UPDATE or INSERT... SELECT query for updating or duplicating the selected records.
The record duplication feature works best on tables that have an auto-incrementing primary key as it is difficult to ensure the uniqueness of the key otherwise.
Error handling and logging
Error-handling is an important part of application design. There are two kinds of error that an application can encounter:
- User error: This type of error is caused and can be rectified by the user; for example, the user enters a value for a data field that is out of the allowed range for that field.
- System error: This type of error is caused by circumstances beyond the control of the application and its developers and requires the intervention of support or maintenance personnel to fix; for example, a disk filling up or a system crash of some sort.
(This ignores the all-too-common application errors caused by programmers themselves!)
A good error-handling system must indicate to the user when they have made a mistake and how to go about fixing it. In the case of system errors, the error-handling system should attempt to store as much information about the problem as it can to help diagnose and fix the problem.User errors
SCaB has a function, show_error_page(), that redirects to an
error page. It takes an error code as an argument.
Logging the error
PHP provides a function called set_error_handler() that allows a
developer to specify a function that is called whenever an error occurs in
code. SCaB provides its own error handler which writes as much information
about the programming environment. The information is written to a file on
disk in an XML format (though it is not a valid XML because it lacks a root
element) The file is located in SCaB’s home directory. There are
scripts in SCaB to display the error log as HTML or as an
Atom[11]
feed. There is no facility within SCaB to archive or prevent the log file from
growing very large; this should be done with
logrotate[12] or similar.
The error log can contain sensitive information like the username
and password used to log onto the database because all information about the
PHP environment is stored when the error occurred. Also, because the error
handler cannot depend on the database being up when an error occurs (in fact,
the very reason that the error occurred might be the database suddenly going
down), it does not use the database-based authentication mechanism used by the
rest of SCaB. For this reason, the .htaccess file in the SCaB
distribution denies access to the “view errors” pages to
anyone and this functionality is not available in the online
demonstration. It is up to the system administrator who should be allowed
access these pages and to implement an appropriate access-control mechanism
(e.g. by using the authentication mechanisms provided by the Web server or
limiting access by IP address). Screenshots of the HTML and Atom views of the
error log are shown below. The Atom view is displayed using
Sage[13],
an RSS and Atom feed reader for Mozilla Firefox.
Known issues
SCaB will not work properly with versions of Internet Explorer earlier than
version 7. This is a result of a long-standing bug in Internet
Explorer [13] in
the way <button> elements are handled in forms. This is a
good example of a bug in SCaB that would be unforgivable in a “real” Web
application but which I was willing to live with (for the simple reason that
I don’t use Internet Explorer)
Also, SCaB insists that all dates be given in the format YYYY-MM-DD. Programmers will be familiar with dates in this format as it is similar to the ISO8601 format used by XML and other standards; however, most general users will probably find it annoying that they cannot enter dates in a more familiar format.
If the user submits a form containing data that causes the database update to fail and then clicks Back to fix the problem, the updated values they entered will be lost because the “get” page is not cached. Although most validation problems will be caught if the user has JavaScript enabled in their browser, this is not very user-friendly.
SCaB also relies on the HTTP “Referer” (sic) header to be set so that the user’s browser redirects back to the original page after adding or editing a database record. Some proxy servers remove or otherwise interfere with this header, making the experience of using SCaB from behind one of these proxies less than optimal.
Future directions
SCaB exists to serve two purposes: to help me keep track of my spending and to act as a testbed for exploring PHP. As a result, its evolution tends to be rather. However, there are a number of parts of SCaB that would make good candidates for improvement.
- Fix problem with buttons in old versions of Internet Explorer.
- There is probably a lot of redundant code in the
list_xxx.php,get_xxx.phpandput_xxx.phpscripts that could be cleaned up. A set of scripts to generate skeleton versions of these pages (similar to the “scaffold” feature in Ruby on Rails) would be a good addition. - A number of features are still missing or incomplete as they are rarely used. These include: creating and deleting account sets and groups, and editing user permissions.
- Remove all hard-coded strings from JavaScript to facilitate
localisation (for example, by using PHP’s
gettextlibrary) - Modify the code to take advantage of the new features of PHP5. For
example, it may be possible to replace the
db_connectionclass with a PDO object, thus reducing the amount of custom code in the application and making the code more database-independent. However, migration to PHP5 is impeded by the slow take-up of PHP by hosting services. - Add support for the posting of recurring transactions such as direct debits and standing orders. Much of the code for this feature already exists in SCaB but its implementation has been hampered by difficulties in providing a satisfactory user interface for defining and modifying the recurrence rules.
- Add a reporting feature to generate, for example, Income & Expenditure/ Trading Profit & Loss and Balance Sheets. This would be a good way to explore the options for generating PDF files with PHP.
References
Document history
| Version | Author | Date | Comment |
|---|---|---|---|
| 1.0 | Ken Keenan | 07 August 2007 | Initial version |