random technical thoughts from the Nominet technical team

Oracle data skew & statistics

Posted by jason on Mar 28th, 2008

I recently encountered a classic example of the Oracle optimizer being fooled by data skew. James Morle presented at Hotsos recently on how the vast majority of issues are caused by skew and latency. Certainly I have seen many bad plans chosen by the Oracle optimizer because it was not aware of the distribution of the data.

So I have a table that looks like the following:

 Name                       Null?    Type
--------------------------  -------   --------------
 KEY                        NOT NULL VARCHAR2(255)
 SUFFIX                     NOT NULL VARCHAR2(255)
 INSTANCE                   NOT NULL NUMBER(10)
 NS_ID                      NOT NULL NUMBER(10)
 CREATED                    NOT NULL DATE
 CREATED_BY                          VARCHAR2(200)
 REMOVED                             DATE
 REMOVED_BY                          VARCHAR2(200)

The type of query that we were having issues with was of the form:

SELECT COUNT(*)
FROM NS_ON_DOMAINS
WHERE KEY = :B4
AND SUFFIX = :B3
AND INSTANCE = :B2
AND NS_ID = :B1
AND REMOVED IS NULL

The first thing to bear in mind is that the combination of key and suffix is pretty selective. ns_id can be highly selective, but it can also be extremely unselective. I used the technique described by Greg Rahn to determine why the optimizer was choosing a particular plan that gave a response time of the order of 3 minutes.

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------------
|   1 | SORT AGGREGATE               |                         |      1 |      1 |      1 | 00:02:07.29 |   19668 | 17459 |
|*  2 |  TABLE ACCESS BY INDEX ROWID | NS_ON_DOMAINS           |      1 |      1 |      0 | 00:02:07.29 |   19668 | 17459 |
|*  3 |   INDEX RANGE SCAN           | IX_NS_ON_DOMAINS_NS     |      1 |      1 |  49495 | 00:00:00.32 |     178 |   177 |
---------------------------------------------------------------------------------------------------------------------------

So the optimizer has chosen to use the IX_NS_ON_DOMAINS_NS index with the NS_ID value, and thinks this will be highly selective. Very often this turns out to be the case. For this particular value, however, NS_ID is completely non-selective: the index range scan returns 49,495 rows against an estimate of 1.

Forcing the optimizer to use the other obvious index, the one on key, suffix and instance, we have the following:

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------------
|   1 | SORT AGGREGATE               |                         |      1 |      1 |      1 | 00:00:00.01 |       4 |     1 |
|*  2 |  TABLE ACCESS BY INDEX ROWID | NS_ON_DOMAINS           |      1 |      1 |      0 | 00:00:00.01 |       4 |     1 |
|*  3 |   INDEX RANGE SCAN           | IX_NS_ON_DOMAINS_DOMAIN |      1 |      3 |      0 | 00:00:00.01 |       4 |     1 |
---------------------------------------------------------------------------------------------------------------------------

Now the optimizer is overestimating the number of rows it thinks it will find via this access path (3 estimated, 0 actual). This tells us why, for the particular values of the bind variables we were looking at, the plan chosen by the optimizer is not the optimal path to the data. The optimizer is being tricked by the very large skew in this data.
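As a rough sketch of why the estimate goes wrong: when an optimizer knows nothing about how values are distributed in a column, a common simplification is to assume every distinct value is equally common, so the estimate for an equality predicate is total rows divided by distinct values. The Python below is illustrative only and does not model Oracle's actual costing. The column contents are made up, apart from the 49,495 figure taken from the A-Rows column of the first plan.

```python
from collections import Counter


def uniform_estimate(values):
    """Rows expected for `col = x` if every distinct value were equally common."""
    return len(values) / len(set(values))


def actual_rows(values, target):
    """Rows that really match `col = target`."""
    return Counter(values)[target]


# Hypothetical NS_ID column: one name server (id 0) referenced by 49,495 rows,
# plus 500,000 name servers referenced by exactly one row each.
ns_ids = [0] * 49495 + list(range(1, 500001))

print(f"estimated rows for any value: {uniform_estimate(ns_ids):.2f}")
print(f"actual rows for a typical value: {actual_rows(ns_ids, 1)}")
print(f"actual rows for the skewed value: {actual_rows(ns_ids, 0)}")
```

The estimate is the same whichever value is bound, which is exactly the problem: it is about right for most NS_ID values and wrong by several orders of magnitude for the popular one.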

In the end we decided to use a hint for this particular query to force it to use the better index.

DomainKeys signing for nominet.org.uk e-mails… not just yet.

Posted by dmitri on Mar 28th, 2008

I was set a task to deploy DomainKeys signing for @nominet.org.uk e-mails, so that any message that appears to come from @nominet.org.uk but is not signed with our key would be treated as suspicious. At first glance it appeared to be quite simple:
1. Generate a private/public key pair.
2. Configure our mail servers to sign outgoing e-mails with the private key.
3. Publish the public key in the nominet.org.uk zone.
And that’s done! Not quite.

We have some auxiliary mail servers serving nominet.org.uk subdomains, e.g. lists.nominet.org.uk (which is not delegated), where we cannot deploy DomainKeys signing just yet. After reading rfc4870 I realized that a granular DomainKeys signing policy published in DNS would be just what we wanted. So my thought was to publish a policy like this:

1. any e-mails coming from @nominet.org.uk MUST be signed.
2. any e-mails coming from @subdomain.nominet.org.uk MAY be signed.

So the real records in the nominet.org.uk zone, using lists.nominet.org.uk as the example, would look like this:

_domainkey IN TXT "o=-"
_domainkey.lists IN TXT "o=~"

Here I bumped into a problem. Nowhere in rfc4870 was it specified that MTAs MUST look up a subdomain _domainkey policy. So I could not be sure that every MTA implementation would look up _domainkey.lists.nominet.org.uk for @lists.nominet.org.uk e-mails, rather than just looking up the _domainkey.nominet.org.uk policy. As a result I could not be sure that all MTAs would read our DomainKeys policy correctly.
And at that point I was told that rfc4870 had been obsoleted by rfc4871 and something important about signing policies had changed.
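The lookup ambiguity can be made concrete with a small sketch. The helpers below are hypothetical and are not taken from any MTA or DNS library; the only thing they borrow from the records above is that a domain's policy is published in a TXT record under the _domainkey label. Which of the candidate names a given MTA actually queries is precisely the open question.

```python
def policy_name(domain):
    """DNS name of the DomainKeys policy record for `domain`."""
    return "_domainkey." + domain.rstrip(".").lower()


def candidate_policy_names(from_address, org_domain):
    """Policy names an MTA *might* consult, most specific first.

    Hypothetical: walks from the sender's domain up to `org_domain`.
    An MTA may stop at the first name, skip to the last, or try both.
    """
    domain = from_address.rsplit("@", 1)[1].rstrip(".").lower()
    names = []
    while True:
        names.append(policy_name(domain))
        if domain == org_domain or "." not in domain:
            break
        domain = domain.split(".", 1)[1]  # drop the leftmost label
    return names


for name in candidate_policy_names("someone@lists.nominet.org.uk", "nominet.org.uk"):
    print(name)
```

With the records above, an MTA that consults only the first name sees "o=~" for the list server, while one that consults only the second sees "o=-" and would treat unsigned list mail as suspicious.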

So, as of now, nominet.org.uk e-mails are not being signed yet and I am back reading RFCs, i.e. rfc4871. I hope I read the right RFC this time.

Joomla, only half a CMS

Posted by jay on Mar 27th, 2008

I’m building a web site for my village and I decided to use a new CMS for some fun. The two apparently popular choices were Joomla and Drupal, both PHP based with similar functionality on the surface. Joomla is regarded as the easier of the two, so I decided to start with that. In hindsight this was a mistake: Joomla is only half a CMS, as I hope to explain.

As an aside, all the documentation I have read on Joomla is pretty poor at explaining how it all fits together, so this post also acts as a decent basic guide.

Content structure

Articles are the basic text of web pages and are displayed in the main body of the page.  When you write one it can be either uncategorised or placed inside the content structure.  The content structure is made up of sections at the top level and categories within the sections.   Yes that’s right, Joomla imposes a two level content structure, no more and no less.

Sections and categories are used in various mechanisms to display content, in particular in menus. For example, you can easily show a list of the sections, the categories, or the documents in a category. However, they are not part of the navigation structure, which is entirely independent.

Site structure

The site structure is the hierarchy within the URLs for your site and in Joomla this is created entirely by your menus.  Your location on the site is always within a specific menu path.  When you move to another part of the site you always move menu path at the same time.

Breadcrumbs always show your current menu path.  If you have friendly URLs configured then these are generated from the menu structure.

It is possible to simulate a menu system that is entirely independent of the site structure by creating a hidden set of menus for the site structure and using the visible menus as redirections into this hidden set of menus.

Interestingly, URLs are generated entirely from menus. Each menu item has an alias and that is used for the URL segment created by that menu. You can’t assign an alias to an article that is then used whenever that article is displayed. This also means that if you have a menu item that displays a list of articles then the URLs for those articles cannot be search engine friendly. You simply have to use menus whenever you want a URL like that.

Templates

Templates define the look and feel of the site and you can find a wide variety of free ones out there.  Switching between them is pretty easy.

Modules

Modules generate blocks of content.  Each module is a specific instance of a type of module with individual parameters.  The built-in types of module include breadcrumbs (called mod_breadcrumbs), banners (called mod_banners) and footer (called mod_footer).  You can create any number of modules of the same type each with different parameters.

What appears where on the page

Each template has different areas for content on it called positions.  For example the template ‘beez’ defines the following set of positions:

left
right
top
breadcrumb
user1
user2
user3
user4
debug
syndicate

These positions are defined for each template in the file templateDetails.xml.  Within the Template Manager you can select Preview to see where the various positions are on a page.

Interestingly, these positions do not include the central area of the page that the article displays in; they only define the areas around it. This is one of the “half a CMS” features of Joomla.

The Module Manager is used to put modules into different positions.  You can assign more than one module to the same position and the order field is used to determine the order in which they appear.  For example the position ‘left’ is often used for the left-hand column of a three-column layout and multiple menu modules are commonly assigned to this position.  The first one appears at the top, then the second below it and so on.

For each module you can also select on what pages it appears. This is done within the Module Manager and is configured by selecting the menu items within which the module appears.

Suppose you want a number of modules to appear in the right-hand column of a three-column layout on the front page, but the rest of the pages not to have a right-hand column. You then configure the modules to appear only on the home page and on none of the other menu items. They effectively disappear when you leave the front page.
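The position, order and menu-item rules described above can be modelled in a few lines. This is a minimal sketch of the idea only, not Joomla's own code or data model: the Module class, the modules_for function and the sample module titles are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    title: str
    position: str  # template position, e.g. "left" or "right"
    order: int     # lower values appear first within a position
    menu_items: set = field(default_factory=set)  # menu items it is shown on


def modules_for(modules, position, menu_item):
    """Titles of the modules shown in `position` on `menu_item`, in display order."""
    visible = [m for m in modules
               if m.position == position and menu_item in m.menu_items]
    return [m.title for m in sorted(visible, key=lambda m: m.order)]


# Hypothetical site: two menus on every page, right-hand column on "home" only.
modules = [
    Module("User menu", "left", 2, {"home", "news"}),
    Module("Main menu", "left", 1, {"home", "news"}),
    Module("Latest news", "right", 1, {"home"}),
    Module("Banners", "right", 2, {"home"}),
]

print(modules_for(modules, "left", "news"))
print(modules_for(modules, "right", "home"))
print(modules_for(modules, "right", "news"))
```

Because visibility is keyed on the menu item, the right-hand column is empty on every page except the one reached through "home".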

In some other CMS you can have one article replace itself by any other article and keep the same menu path, which would mean that a particular content item might appear in any menu path and theoretically different modules would be visible each time.  However within Joomla one article cannot replace itself with another.  Instead all one article can do is link to another by specifying the full URL including, by definition, the menu.  This means that the modules that are visible for any article are always determined by where it appears in the menu structure.

If you want one article to appear in two ways, with different modules appearing in each, then you need two different menu paths for the same article, with the modules set to appear differently for each menu context.

Main body of the page

The main body of the page is different from the positions explained above.  It is its own unique area.  The way this main body of the page looks is determined by the menu item that is used to access it.  Remember that even if you link from another page, it still goes via the menus so this always holds true.

Joomla uses the term ‘content layouts’ for the different ways the main body can be displayed.  The built-in content layouts include:

  • The article itself
  • A blog style layout of the articles in a particular section or category.  This normally displays one article at the top with two columns of articles below it.
  • A blog archive of the articles in a particular section or category.  This is the same as the blog style layout but includes drop downs to select articles from a specific date.
  • A list style layout of the articles in a particular section or category.  This is presented in tabular format with sortable columns.

Content layouts can also be generated by components. So if you see a Joomla plugin that describes itself as a module only, it cannot generate content for the main body; it has to have component functionality as well. The built-in components include:

  • Search form
  • Login form
  • A wrapper around some external content.  This is implemented using an iFrame.

To select any content layout you have to create a menu item and specify which type of content layout that menu item leads to. For example, if you want to create a search form then you create a menu item (called ‘search’ I guess) which is configured to generate a search form content layout. The oddity is that if you want a link to the search form that is separate from a menu item, it still has to link to a menu item path for the form; there is no other way to create the search form.

Conclusion

This poor control over the main body of the page and the weird way that menus are used for URLs are my main disappointments with Joomla. Had I found a number of good third-party components that provide real flexibility for content layouts with good URL support then I might have been pacified. As it is, I switched to Drupal and found that light years better, but more on that later.

IntelliJ 7 for Mac still doesn’t pick up external file changes

Posted by matt on Mar 26th, 2008

IntelliJ, the extremely good Java IDE from JetBrains, will often fail to pick up external file changes, for example after performing a Subversion “svn up” on your source code.

The problem can be fixed by removing or renaming fslogger in IntelliJ’s bin directory (e.g. /Applications/IntelliJ IDEA 7.0.3.app/bin) as is well documented.

JetBrains continue to ship this library, enabled by default, for version after version of IntelliJ. Each time I install a new version, the first thing I have to do is disable fslogger. If I forget, I tend to end up wrongly accusing a colleague of checking in broken code.

The fslogger library doesn’t work for anyone here as far as I am aware. Shipping a library that actually stops a product from working, when the product works so well without it, is something I find hard to understand. Perhaps JetBrains would consider removing this library from the IntelliJ distribution or shipping it disabled by default.

Apache & Shared Memory

Posted by jason on Mar 26th, 2008

We recently had an issue whereby one of our Apache servers crashed without cleaning up a shared memory segment. We were using shared memory for SSL session caching:

SSLSessionCache         shm:/var/log/httpd2/gcache(512000)

This uses a shared memory segment:

-bash-2.05b$ sudo ipcs -a
Password:
IPC status from  as of Tue Mar 25 16:26:29 GMT 2008
T ID KEY MODE OWNER GROUP CREATOR CGROUP CBYTES QNUM QBYTES LSPID LRPID STIME  RTIME CTIME
Message Queues:
T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
Shared Memory:
m 2304 0x13022 --rw------- root other root other 32 512000 17975 9314 9:01:41 9:26:26 9:01:41
T    ID  KEY  MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME  CTIME
Semaphores:
s 65536 0x1cbd       --ra-------     root     root     root     root   129 no-entry 13:47:45
s       1  0x1001cbd  --ra-------     root     root     root     root   128 no-entry 13:47:45

After Apache had crashed, we received the following error when we attempted a startup:

[Tue Mar 25 08:43:48 2008] [error] Cannot allocate shared memory: (17)File exists

It was a pretty straightforward fix. All we had to do was the following (as root):

ipcrm -m 2304 

This removed the stale shared memory segment, and Apache could then be started quite happily once again.
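If this happens often, picking the segment ID out of the ipcs listing by eye gets tedious. The sketch below assumes the listing format shown above, where shared memory rows start with the type flag "m" followed by the ID; other platforms print ipcs output differently. The function name is invented for illustration. It only prints candidate commands and deliberately does not run ipcrm, since a segment may still be in use by another process.

```python
def shared_memory_ids(ipcs_output):
    """Return the IDs of shared memory segments listed in `ipcs -a` output."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Segment rows start with the type flag "m", followed by the numeric ID.
        if len(fields) >= 2 and fields[0] == "m" and fields[1].isdigit():
            ids.append(int(fields[1]))
    return ids


# Two rows copied from the listing above, with whitespace collapsed.
sample = """\
Shared Memory:
m 2304 0x13022 --rw------- root other root other 32 512000 17975 9314 9:01:41 9:26:26 9:01:41
Semaphores:
s 65536 0x1cbd --ra------- root root root root 129 no-entry 13:47:45
"""

for shm_id in shared_memory_ids(sample):
    print(f"ipcrm -m {shm_id}")
```

Check the owner and attach count of each segment before removing anything.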

Installing Iris Explorer on Ubuntu

Posted by oliver on Mar 26th, 2008

Having recently set up my new Linux box with Ubuntu 7.10, I have just installed Iris Explorer successfully (eventually).

In order to ensure Iris Explorer runs properly there are a few additional libraries that need installing.

The first step is to request a license key from NAG’s support section. In order to do this you need to run the supplied key_rqst program. However, the install of Ubuntu I am using is 64-bit and the key_rqst program is 32-bit, so this needs the installation of libc-i386.

In order to actually execute Iris Explorer you also need to install the following libraries:

  • libmotif3
  • libstdc
  • libg2c0
  • libg2cDev

Finally, Iris will run with the above installed, but some modules (typically display modules) will not run unless the gcc libraries are installed as well.

DataCash: Continuous Authority and 3-D Secure, choose one

Posted by chris on Mar 20th, 2008

I’ve blogged before about our experience of using DataCash as a payment service provider, especially with regard to the difficulties of getting 3-D Secure working correctly.

Since then, we’ve hit another difficulty. We want to implement continuous authority (CA), which in essence is like setting up a direct debit on a credit card. This is because some of our customers would like us to simply charge their credit card with the outstanding amount each month. We started implementing this and got quite a long way down the path before hitting a major obstacle.

To set up the authority you need to make an initial payment and simply flag it as being special. This initial payment is exactly like an ordinary credit card payment, but you get back a reference you can use next time around instead of providing the customer’s card details. We have committed to using 3-D Secure for any cards that support it, so that automatically took place for the initial payment, which is made online. Unfortunately DataCash can’t support both CA and 3-D Secure. So all of this development work had to be shelved.

Luckily for DataCash (and unluckily for us), no-one else who does CA with 3-D Secure also supports the other services we use from DataCash (in particular paperless direct debit). So we are still a customer. But it does seem amazing that this support is not there, especially as 3-D Secure is slowly but surely being mandated by the big card providers.

Scribefire for Firefox - A useful blogging extension

Posted by graeme on Mar 20th, 2008

I’ve recently found myself using the Firefox extension Scribefire more and more. It hides away nicely as a single button on your status bar, but when opened shows itself to be a well-featured editor for writing blog posts. You can drag and drop text into a post, work on posts you’ve already published and, most handy for me, choose from several different blogs to post to, across multiple blogging platforms. I’ve found it works well with WordPress.

The only proviso I’d make is that it appends a “Powered by Scribefire” message to the end of your posts. You can however turn this off. With the editor open, click on the double arrows in the top left-hand corner. In Settings > Publishing is an option to remove this. Apart from this minor annoyance, I do like to use it. It seems a little more natural and easier to compose in than WordPress’s own interface.

Keeping control of dependencies with Structure101

Posted by chris on Mar 20th, 2008

In the recent Jolt Awards you may (or may not) have noticed an award to Structure101 by Headway Software. We’ve been using this tool for a while now and I thought it might be worth giving my impressions. Put simply it allows you to visualise the dependencies within your Java codebase and highlights areas of complexity. It does this using a concept called ‘Excess Complexity’ which Headway Software explain on their website. It shows this complexity at every granularity from top-level package right down to method.

Obviously decisions about this sort of thing can never be entirely automated, but this tool does give you a good way to track down design ‘smells’ in the code. It may be that the complexity is not a problem, but it certainly gives you cause to reconsider your design decisions in that area.

You may be thinking at this point “But my IDE already shows dependencies, so why would I need to pay good money for this?”. That may be true and I know for a fact that IntelliJ IDEA at least will display a grid of package dependencies for you. But if you decide that you want to unpick some circular dependencies it is very hard to see what’s causing the problem. Compare:

[Images: s101_diagram.png (Structure101 dependency diagram) and idea_diagram.png (IntelliJ IDEA dependency grid)]

I know which of these is more intuitive. The Structure101 diagram also has the advantage that clicking on the links shows exactly which piece of code is causing the dependency.

At the moment we are using the tool to keep our dependencies under control by just analyzing them and graphing the overall complexity figure on a whiteboard. This helps to motivate you to keep it down. We will be fully integrating it with our continuous integration server when we move over to a new system in the next week or so. After that we may consider using the IDE plugins to highlight design problems to individual developers as they work.

Overall, a nice piece of work to keep control of complexity and one that is now free for open source projects.

First Erlang gotcha - Variables that don’t vary and pattern matching

Posted by alexd on Mar 18th, 2008

Well, I had hoped that my first foray into Erlang might have got further - instead, I fell for the first trick in the book…

I wanted to write a little UDP server which I could grow into something more interesting. The server side seemed to work just fine, but for some reason I simply couldn’t get the server response to reach the client. The client had code of the form:

  query(Msg) ->
    {ok, Socket} = gen_udp:open(0, [binary]),
    ok = gen_udp:send(Socket, "localhost", 4545, term_to_binary(Msg)),
    Value = receive
                % Msg is already bound to the query at this point
                {udp, Socket, _, _, Bin} = Msg ->
                         % deal with the response message
   . . . .

Of course, the Msg variable is already bound by the time the response returns. Erlang variables are single-assignment, so “= Msg” in the receive pattern is a match against the existing value rather than a fresh binding, and the first pattern in receive will never match (presuming that the response is different to the query).

I really shouldn’t be copy-pasting code (which is how Msg ended up in two places in the same function). However, I’m obviously going to have to get my eye in for this sort of thing if I’m going to be doing much with Erlang. It took me far too long to find this obvious bug!
