random technical thoughts from the Nominet technical team

Experiments with JTAPI - Part 1 - Making a call

Posted by ray on Jan 25th, 2008

We have a Cisco CallManager (CCM) VoIP telephone system here at Nominet and I’m currently investigating how this system might be adapted to support ENUM lookups for outbound calls so that calls can be made directly to other VoIP systems without going over the PSTN. There’s no built-in support for ENUM in CCM at all, so some sort of plugin would be required.
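
For readers unfamiliar with ENUM, the lookup key is derived mechanically from the telephone number: keep only the digits of the E.164 number, reverse them, separate them with dots and append e164.arpa. The resulting domain is then queried in the DNS for NAPTR records. The sketch below shows only that conversion; the class and method names are illustrative, the number is a made-up example, and the DNS query itself is left out.

```java
/* Sketch only: converts an E.164 number into its ENUM domain name. */
final class EnumDomainSketch {

    /* e.g. "+44 1632 960000" becomes "0.0.0.0.6.9.2.3.6.1.4.4.e164.arpa" */
    static String toEnumDomain(String e164Number) {
        StringBuilder domain = new StringBuilder();
        /* walk the number backwards, keeping only the digits */
        for (int i = e164Number.length() - 1; i >= 0; --i) {
            char c = e164Number.charAt(i);
            if (c >= '0' && c <= '9') {
                domain.append(c).append('.');
            }
        }
        return domain.append("e164.arpa").toString();
    }

    public static void main(String[] args) {
        /* hypothetical example number, not a real subscriber */
        System.out.println(toEnumDomain("+44 1632 960000"));
    }
}
```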

I’ve not been able to find any internal plugin APIs, but CCM does support most of Version 1.2 of the Java Telephony API. My hope is that this API will provide tight enough integration with the system to allow the necessary outbound call routeing.

I found that there are very few examples around of how to use JTAPI to actually do anything. Cisco does provide a sample application, but it’s unnecessarily complicated. The code below is therefore presented as a much simplified version of that application that does nothing except tell a CCM extension to dial a specified number.

To use this code you’ll need to obtain the jtapi.jar file from the CCM Administration software via Applications->Plugins->JTAPI. You’ll also need to create a new “Application User”, and add that user into the JTAPI Groups, and also grant that user access to some terminals.

With jtapi.jar in your CLASSPATH, this application can be run as:

% java MakeCall {host} {user} {password} {extension} {number}

As soon as the program has told CCM to place the call it will exit.

As this is only an example there’s only minimal input and error checking, but the application will generate an appropriate error message if you supply incorrect JTAPI login details, or if you try to trigger a call from an extension that your JTAPI user doesn’t have access rights to.

There are two specific parts of the code that deserve special mention:

  1. The com.cisco.cti.util.Condition class is not part of the JTAPI standard, but provides a useful semaphore object. In this case it’s used to make the main thread block until the ProvInServiceEv event is received by the anonymous ProviderObserver object.
  2. It appears that it’s always necessary to add a CallObserver to the Address object (even if that observer subsequently does nothing useful itself) since the JTAPI library throws an exception if you don’t. I couldn’t find this behaviour documented anywhere.
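
If you would rather not depend on the Cisco-specific Condition class, the same "block until the event arrives" pattern can probably be built from the standard library. The following is a minimal sketch using java.util.concurrent.CountDownLatch; it contains no JTAPI code at all, the background thread merely stands in for the JTAPI event thread, and the class and method names are my own invention.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/* Sketch only: a standard-library stand-in for the Condition semaphore. */
final class InServiceLatchSketch {

    /* Returns true if the "in service" signal arrived within timeoutMillis. */
    static boolean waitForSignal(long signalDelayMillis, long timeoutMillis) {
        final CountDownLatch inService = new CountDownLatch(1);

        /* stand-in for the thread that would deliver ProvInServiceEv */
        Thread eventThread = new Thread(() -> {
            try {
                Thread.sleep(signalDelayMillis);
            } catch (InterruptedException e) {
                return;
            }
            inService.countDown();   /* plays the role of inService.set() */
        });
        eventThread.setDaemon(true);
        eventThread.start();

        try {
            /* plays the role of inService.waitTrue(), but gives up after a timeout */
            return inService.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(waitForSignal(50, 2000));
    }
}
```

The timeout is the main reason to consider this variant: the main thread no longer waits forever if the provider never comes into service.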

MakeCall.java

import javax.telephony.*;
import javax.telephony.events.*;
import com.cisco.cti.util.Condition;
 
public class MakeCall
{
	public MakeCall(String[] args) throws Exception
	{
		String hostname = args[0];
		String login = args[1];
		String passwd = args[2];
		String src = args[3];
		String dst = args[4];
 
 		/* start up JTAPI */
		JtapiPeer peer = JtapiPeerFactory.getJtapiPeer(null);
 
 		/* connect to the provider */
		String providerString = hostname;
		providerString += ";login=" + login;
		providerString += ";passwd=" + passwd;
		Provider provider = peer.getProvider(providerString);
 
		/* wait for it to come into service */
		final Condition	inService = new Condition();
		provider.addObserver(new ProviderObserver() {
			public void providerChangedEvent (ProvEv [] eventList) {
				if (eventList == null) return;
				for (int i = 0; i < eventList.length; ++i) {
					if (eventList[i] instanceof ProvInServiceEv) {
						inService.set();
					}
				}
			}
		});
		inService.waitTrue();
 
		/* get an object for the calling terminal */
		Address srcAddr = provider.getAddress(src);
		srcAddr.addCallObserver(new CallObserver() {
			public void callChangedEvent (CallEv [] eventList) {
				/* ignored */
			}
		});
 
		/* and make the call */
		Call call = provider.createCall();
		call.connect(srcAddr.getTerminals()[0], srcAddr, dst);
	}
 
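	public static void main(String[] args) {
		/* minimal input checking: all five arguments are required */
		if (args.length != 5) {
			System.err.println("usage: java MakeCall {host} {user} {password} {extension} {number}");
			System.exit(1);
		}
		int status = 0;
		try {
			new MakeCall(args);
		} catch (Exception e) {
			e.printStackTrace();
			status = 1;	/* report the failure to the calling shell */
		} finally {
			/* force the JVM to exit, even if library threads are still running */
			System.exit(status);
		}
	}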
}

Net-top-box, InfoGlue and MIME/media types…

Posted by ewan on Jan 23rd, 2008

We run a “digital signage” screen in our foyer - in other words it’s a large plasma screen displaying some static content (and also ‘live’ content such as TV) that is simply served from a box. The box (aka “net-top-box”) is a certain ‘NTB115’ from a company named OneLan; it runs Mandrake and has been described elsewhere as a “Mini-ITX board in a nice case with a video output card”. It has a reasonable web interface for customisation of the content layout but you can also configure it directly via local XML files.

Whilst it is reasonably flexible in displaying TV or static web content, some of its features are not so design friendly, such as RSS feed inclusion. With these feeds, you’re only able to style the font of the heading and subsequent description - not the width of the RSS area on the screen, nor the length of the content itself (first 200 characters only? Sorry.). Consequently if you run out of space the RSS feed simply truncates, and so you’re limited to displaying an RSS area as one long line (no line breaks) in order for it to be readable, as you can’t wrap it.

As I wanted to pull and also display (via the net-top-box) our own RSS content from our CMS (we use InfoGlue), I thought that it would be easier to ‘pull’ the content into a CMS resident HTML ‘page’, style it there and then (whilst still within the CMS), and *then* call that resulting page into the net-top-box layout as an ‘HTML content area’ (you only get two choices here - RSS their way or an HTML link). Since our RSS content is already XML, the best way to generate a styled HTML page from it is obviously to transform it via XSLT. We don’t so much create a page as create a static URL to an XML file, which in turn references an XSLT file - so the transformation (and the resulting HTML) happens on the net-top-box and not within InfoGlue. There’s nothing in the manual to say you can’t do this, but also nothing to say you can.
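
To give a feel for what such a stylesheet produces, here is a minimal sketch that runs the same kind of transformation with the XSLT processor built into Java rather than on the net-top-box. The tiny feed, the stylesheet and all the names are invented for illustration; our real files are larger and are transformed on the box itself, not by code like this.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/* Sketch only: transforms a made-up RSS feed into HTML via XSLT. */
final class RssToHtmlSketch {

    /* hypothetical one-item feed */
    static final String RSS =
        "<rss version=\"2.0\"><channel><title>News</title>"
        + "<item><title>First item</title><description>Hello</description></item>"
        + "</channel></rss>";

    /* hypothetical stylesheet using the html output method */
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><body><xsl:for-each select=\"rss/channel/item\">"
        + "<h2><xsl:value-of select=\"title\"/></h2>"
        + "<p><xsl:value-of select=\"description\"/></p>"
        + "</xsl:for-each></body></html>"
        + "</xsl:template></xsl:stylesheet>";

    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(RSS, XSL));
    }
}
```

Once the content is ordinary HTML, its width, wrapping and length can be controlled with normal styling, which is exactly what the box's own RSS area does not allow.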

Simple in theory - but it turned into a bit of a pain. It’s difficult to know (unless you’re a Linux expert - I’m not, I’m a web designer) exactly how the innards of the net-top-box work, i.e. whether it reads HTML straight off the bat or has to render it from a transformed XML file, but I do know it transforms its own XML files, so there’s an XSLT processor in there somewhere. When it came to testing the files from the CMS on the box, the result on the display screen was simply a blank area, which usually denotes whitespace issues, character-encoding issues or simply badly formed XML. Having eliminated all of these causes, I was left with a CMS vs. net-top-box puzzle, i.e. something in the way the files are generated and then called didn’t match.

It turned out (after much testing) to be the media types - no errors as such, just a rather odd way they are treated within InfoGlue or within the net-top-box (exactly which system, I’m still unsure). For an XML or XSL file to exist within InfoGlue it has to have (in our CMS structure anyway) a content-type defined; additionally, a media type or output method can be explicitly defined in the file itself. The RSS XML file contains the reference <?xml-stylesheet type="text/xml" href="xslstylesheetname"?>. I expected this reference to declare the stylesheet as type="text/xsl", not type="text/xml", but it needs to be "text/xml" if it’s to work. The CMS also has this XML file itself as content-type "text/xml" (which is correct, as it’s not an XHTML file).

The XSL file has its output method set as <xsl:output method="html"/>, again as expected since the first child of the root node is going to be <html>, but the InfoGlue content-type for this file, for some reason, *has* to be "application/xml" - or it simply won’t marry the two files up. Again, I would have expected the content-type to be "text/xsl" (even though "text/xsl" is not considered to be the appropriate MIME type for XSLT files, Internet Explorer needs it in order to render them, and IE is the engine in the net-top-box for displaying content…).

So my conclusion is that it’s a MIME type issue - whether it’s the net-top-box not having the necessary MIME types configured, or whether it’s an IE issue, or both, I’m not sure. But hey, it works, at last. ;-)

The cause of, and failure to detect, a web site outage

Posted by ian on Jan 18th, 2008

On 10 January 2008 our web site was unavailable for several hours. The cause of this outage, and our failure to detect it, threw up some interesting points.

We have always seen examples of abuse of our systems. Usually this takes the form of high volumes of requests sent to the whois server. We respond by throttling the traffic or, in more extreme cases, blocking the originating IP address from accessing the server. In early January we became aware that two IP addresses were responsible for 68% of the bandwidth to our web site. Each address was pulling our list of tags once a second, 24 hours a day. This is the biggest page on the website. Between them the two IP addresses were responsible for more than 20GB of traffic per month. Either one was using more than ten times the bandwidth of any other address that accessed the web site.

Our web site sits in the DMZ, outside of the firewall. We have a Juniper router sitting in front of the DMZ and use ACLs to limit access to the web site. Now I like Juniper routers, particularly the CLI. Editing the config on a Juniper, especially if you are only used to Cisco, is a pleasure. However, you do have to be aware of the consequences of your actions. When the decision was made to block these IP addresses a new term was created in the ACL to block access to the web site.

This term consisted of:

  1. Source addresses - the offending IPs
  2. Destination address - our web site
  3. Action to be taken - in this case, discard all packets received

Pretty soon after the block was imposed we were contacted by the owner of the IP addresses. It seems they intended to pull the tag list once a day and had misconfigured the script. I’m of the opinion that there is no need to pull the list like this, but we decided to remove the block anyway. The engineer who had imposed the block decided to leave the term in place, in case it needed to be re-applied. He chose to remove the source addresses only. In doing this we were left with a term that read:

  1. Destination address - the web site
  2. Discard all packets received

With no source addresses left, the term matched traffic from every source, which blocked all access to the web site from outside Nominet. This is the first interesting point. The decision to leave the term in place was a reasonable one, and if only one of the IP addresses had been removed then we would have been fine. [There is an option to deactivate a whole term, but he was unaware of this.] It makes me realise that we need a proper firewall for the DMZ. Router ACLs are only really applicable at layer 3.
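
The trap is easier to see when the matching rule is written down: a term with no source condition matches every source. The sketch below models that rule in a few lines; it is not Juniper code or configuration, and the class name and the documentation-range addresses are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/* Sketch only: models how a discard term matches packets. */
final class AclTermSketch {
    private final Set<String> sources;   /* empty means "no source condition" */
    private final String destination;

    AclTermSketch(Set<String> sources, String destination) {
        this.sources = sources;
        this.destination = destination;
    }

    /* true if a packet from src to dst would be discarded by this term */
    boolean discards(String src, String dst) {
        boolean sourceMatches = sources.isEmpty() || sources.contains(src);
        return sourceMatches && destination.equals(dst);
    }

    public static void main(String[] args) {
        String webSite = "192.0.2.80";   /* hypothetical addresses throughout */
        AclTermSketch withSources = new AclTermSketch(
            new HashSet<>(Arrays.asList("198.51.100.1", "198.51.100.2")), webSite);
        AclTermSketch withoutSources = new AclTermSketch(
            Collections.<String>emptySet(), webSite);

        /* an innocent visitor is unaffected while the sources are listed... */
        System.out.println(withSources.discards("203.0.113.9", webSite));     // false
        /* ...but is blocked once the source list has been emptied */
        System.out.println(withoutSources.discards("203.0.113.9", webSite));  // true
    }
}
```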

The next interesting point is that we were unaware that the web site was not visible to the outside world. The term was removed once we were made aware of the outage, but this information came from outside of Nominet. We have a sophisticated monitoring system, based around Nagios. This gives us a fully configurable and timely view of our systems, but only as seen from within Nominet. We already put our authoritative nameservers within other people’s ASes, so these would be candidate sites for monitoring stations. But one thing I want to do is make more use of things like the RIPE NCC DNSMON service. This gives us a global view of .uk authoritative nameserver availability. At present we use this on an ad hoc basis when diagnosing nameserver incidents. I want to incorporate the raw data (which we have access to) into our monitoring system to ensure we see events that it would not otherwise detect. That includes incidents which segment the network, for example where we can still see the nameserver but half the internet cannot.

I would encourage anyone who can to sign up for a RIPE TTM box to increase the coverage that DNSMON has.

Hessian 3.1.3 issues

Posted by dan on Jan 16th, 2008

Here at Nominet we use Hessian to build our Java middle-tier services. We’ve been using version 3.0.20 for a while and, once you get used to its idiosyncrasies, it’s pretty workable and stable.

Yesterday I tried to update to the latest version, 3.1.3. Suddenly all hell broke loose. A bunch of domain entity objects we’ve been persisting through Hibernate stopped deserialising cleanly at the Java client end. A little debugging revealed that:

  • Instances of java.sql.Timestamp would drift by a few seconds from their transmitted version. We’d been using just such fields along with a @Version annotation to provide optimistic locking for Hibernate-managed entities. Lots of StaleObjectStateExceptions get thrown.
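
To see why even a small drift is fatal, remember that an optimistic-locking check needs the version carried by the client to match the stored one exactly. The sketch below is a simplified stand-in for that comparison, not Hibernate’s or Hessian’s actual code; the class name, method name and values are made up.

```java
import java.sql.Timestamp;

/* Sketch only: a simplified version check based on a timestamp. */
final class TimestampVersionSketch {

    /* An update is treated as stale unless it carries exactly the stored version. */
    static boolean isStale(Timestamp storedVersion, Timestamp submittedVersion) {
        return !storedVersion.equals(submittedVersion);
    }

    public static void main(String[] args) {
        Timestamp stored = Timestamp.valueOf("2008-01-15 10:30:00.123");
        /* a copy that survived the round trip intact */
        Timestamp intact = new Timestamp(stored.getTime());
        /* a copy that drifted by two seconds in transit */
        Timestamp drifted = new Timestamp(stored.getTime() + 2000L);

        System.out.println(isStale(stored, intact));   // false
        System.out.println(isStale(stored, drifted));  // true
    }
}
```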

“No problem” I thought, “they’re just there because we haven’t moved them over to cleaner, less coupled Integer versioning yet. Refactor!”. Half an hour of refactoring class and table definitions later, and the local unit tests pass again. Back to service-level tests:

  • Deserialisation now fails for a couple of alternating reasons, each of which throws a HessianFieldException:
    • Enumerations randomly fail to deserialise, complaining that they couldn’t be assigned from HashSets;
    • Sets of enumerations also fail, with a root IndexOutOfBoundsException from an ArrayList.

At this point I gave up and reverted to the previous version of the library. Another developer here also had the same problems with the updated version, so I think that for now we’ll be sticking with the tried and trusted version 3.0.20. Just because a new library is there doesn’t mean you have to upgrade straight away, or even at all.

Having said that, if anyone can shed any light on these problems it would be very much appreciated…

The RSS feed for the Nominet techblog is moving

Posted by graeme on Jan 15th, 2008

The RSS feed for this site is moving today to: http://blog.nominet.org.uk/tech/feed/. Please redirect your feed reader to this new URL.

Web server on my mobile phone

Posted by ian on Jan 9th, 2008

There is a lot of nice software for my Nokia N95. This week I discovered that this includes a free web server.

Registration is required, where a .mymobilesite.net domain name is chosen, and a username and password set. This step proved problematic yesterday, but has now been fixed.

Installation requires that Python for S60 be removed before the latest version of the web server is downloaded. What is not stated is that Python must be reinstalled before the server is run. If you miss this step the web server enters an endless loop. I had to power cycle the phone to get out of it. With Python installed it ran first time.

The web server is found in the Applications folder. On initial startup it prompts for the username and password chosen when registering the site. Other users can be created, with configurable access rights. It also asks whether the web server should be available over the Internet or Locally. I chose Local for initial testing. This meant I had to access it via IP. To determine your IP address just visit www.IP-adress.com [sic]. I presume running in Internet mode will make it available over DNS, though I haven’t tested this.

Entering the site as the registered user gives you access to the Contacts list, Calendar, and much more, including sending SMS via your web browser. But this is just the default; there is plenty of scope for customisation.

ORA-00600 error (arguments [13009], [5000], …) on Oracle 10.2.0.2 Database

Posted by patrick on Jan 7th, 2008

We experienced the following error on one of our Oracle test databases:

ORA-00600: internal error code, arguments: [13009], [5000], [1], [17], [1], [], [], []

The statement which generated this error was a simple select ... for update nowait:

SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

ERROR at line 2:
ORA-00600: internal error code, arguments: [13009], [5000], [1], [17], [1], [], [], []

However, performing the same operation accessing the row via Oracle rowid was successful:

SELECT rowid, key, suffix, status
FROM table1
WHERE rowid = 'AAAPBRAAPAAAAI2AAk'
FOR UPDATE NOWAIT;

ROWID              KEY           SUFFIX                       STATUS
------------------ ------------- ---------- ------------------------
AAAPBRAAPAAAAI2AAk example-key   co.uk                             0

This led me to believe there must be a problem with the access method used by the first statement, probably a corruption on an index. Performing an explain plan revealed two indexes were used by the first statement:

explain plan for
SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

---------------------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |            |     1 |    12   (9)| 00:00:01 |
|   1 |  FOR UPDATE                      |            |       |            |          |
|   2 |   BITMAP CONVERSION TO ROWIDS    |            |     1 |    12   (9)| 00:00:01 |
|   3 |    BITMAP AND                    |            |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS|            |       |            |          |
|*  5 |      INDEX RANGE SCAN            | IX1_TABLE1 |     1 |     3   (0)| 00:00:01 |
|   6 |     BITMAP CONVERSION FROM ROWIDS|            |       |            |          |
|   7 |      SORT ORDER BY               |            |       |            |          |
|*  8 |       INDEX RANGE SCAN           | PK_TABLE1  |     1 |     8   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

I decided to validate the structure of the table and indexes, but this revealed no errors:

analyze table table1 validate structure cascade;
Table analyzed.

I next rebuilt the indexes used by the query:

alter index pk_table1 rebuild online;
Index altered.

alter index ix1_table1 rebuild online;
Index altered.

This resolved the problem:

SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

ROWID              KEY           SUFFIX                       STATUS
------------------ ------------- ---------- ------------------------
AAAPBRAAPAAAAI2AAk example-key   co.uk                             0

Button placement and style from a usability perspective

Posted by Al on Jan 4th, 2008

While working on some simple wireframes for an upcoming development project, I was wondering whether there was much out on the web about button placement and style from a usability perspective.

Apart from information on how people tend to read web pages (in an F shape working from top to bottom, tapering to the left) and Apple’s UI guidelines (seemingly placing the most important button on the right hand side) I couldn’t find much, so thought I would post my thoughts on the matter.

The context: a form (as part of several steps) with three buttons, one that takes the user to a previous step (“previous”), one that takes the user to a further step (“next”) and one that cancels out of whatever step process one is in (“cancel”). My examples below are a little contrived, but they do illustrate my thought process.

As styled below in simple form, there is little indication at a glance which button is the logical next step (bearing in mind that people usually don’t read things when looking at a page). Some obvious initial problems are that the buttons are too close together, and that the cancel button really is a different kind of action from the other two.

button styling - form 1

By moving the cancel button to the right things become a little clearer. Some people use angled brackets on buttons to indicate direction of flow, which can help visually, as people interpret symbols more intuitively than words.

button styling - form 2

This still causes more problems, as not only is the most important button in the middle, but the semantic meaning of the button is not clear. “Next” and “previous” what exactly?

Looking at the most common usage on the web (paging through search results) “next” really indicates that the user is expecting more of what is displayed currently on the page, especially when used along with “previous”.

Changing the button labels to something more meaningful like “go back” and “continue to next step” makes things a little clearer.

button styling - form 3

But based on the idea that people read a web page in an F, the buttons are still in the wrong position, as the button that comes into view first is the go back, which isn’t the most important.

If we reverse these to make the most important button first (continue to next step), we are then faced with the problem of both arrows diverting attention to the space between the buttons.

button styling - form 4

These angle brackets are most often used when paginating through similar records, but in this instance they are being used to indicate what is the next step going forward. If we replace the angle brackets with a different visual indicator, such as more emphasis on the primary button, we solve this problem.

Another benefit of putting the primary button first is that if the user hits the enter button with focus within an input box on a multi-button form, often browsers will submit the first button they come across.

button styling - form 5

Words like “next” and “previous” can be used when paginating through records along with angle brackets, but are sometimes not totally appropriate when used in other contexts.

I realise the examples above are all purposefully simplistic, and within the context of a step within several steps. On most forms on the web things are a lot simpler, so often just a single button is used, and labels like “submit” are perfectly meaningful and suitable. It’s all about considering the context.

So although buttons seem a minor detail within the greater context of a web page, taking button style and labelling into consideration is important, especially on forms with multiple buttons. By using suitable and more semantically correct labels, good visual indicators, and considering meaning within an overall context, one can much improve the usability of a form.

Oracle Linguistic Indexes

Posted by jason on Jan 4th, 2008

Quite a bit of synchronicity occurred recently for the DBA team here at Nominet. We have been following the writings of Richard Foote and in particular an article on Linguistic Indexes. I thought the article interesting, though somewhat obscure, and filed it at the back of my mind.

We have been working on an upgrade to our ticketing/service management system and we came across a custom written query that was performing like a dog when run by the application, but producing fast performance when run by other schema owners:

SELECT count (*) FROM   user.domains d WHERE  d.key = 'nominet';

------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                |     1 |    14 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                |     1 |    14 |            |          |
|*  2 |   INDEX RANGE SCAN| PK_DOMAINS_IDX |     1 |    14 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

But when run as the user the application connects as:

SELECT count (*) FROM   user.domains d WHERE  d.key = 'nominet';

--------------------------------------------------------------------------------------------
| Id  | Operation                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                |     1 |    14 |  5209   (3)| 00:01:03 |
|   1 |  SORT AGGREGATE           |                |     1 |    14 |            |          |
|   2 |   PX COORDINATOR          |                |       |       |            |          |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000       |     1 |    14 |            |          |
|   4 |     SORT AGGREGATE        |                |     1 |    14 |            |          |
|   5 |      PX BLOCK ITERATOR    |                |     1 |    14 |  5209   (3)| 00:01:03 |
|*  6 |       INDEX FAST FULL SCAN| PK_DOMAINS_IDX |     1 |    14 |  5209   (3)| 00:01:03 |
--------------------------------------------------------------------------------------------

We had other examples where a primary key index was being ignored by the application schema user, which was doing a full table scan of a large table instead. The obvious conclusion to jump to (and the one we immediately jumped to) is a difference in the optimizer environments for the two schemas. We checked this using V$sql_optimizer_env, but the different child cursors produced for the query had the same optimizer environment settings. I even did a 10132 event dump of running the query in the two schemas, but I could not for the life of me put my finger on what was producing the differing plans.

It had to be something in the environment of the application schema, so we looked at roles & privileges until eventually we looked at logon triggers, and there it was:

begin
    execute immediate 'alter session set NLS_COMP=LINGUISTIC';
    execute immediate 'alter session set NLS_SORT=BINARY_CI';
end;

There is a good chapter in the Oracle documentation on linguistic sorting, which you can read for yourself. A couple of things surprise me. First, these NLS changes to a session do not seem to appear in V$sql_optimizer_env or in other exposed views of the optimizer environment at run time, yet they can clearly have a large impact on the plan generated. Secondly, I don’t understand what was wrong with the traditional function-based index and having UPPER(column) whenever a case-insensitive search is required.

Of course, in the interest of full disclosure, I should point out that the full explain plan had the following hiding after the plan:

       filter(NLSSORT("KEY",'nls_sort=''BINARY_CI''')=HEXTORAW('69747600') )
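
That filter line is the clue: the comparison is made on NLSSORT of the column rather than on the column itself, so an index ordered by the raw key values cannot be probed directly for a match. The sketch below is only an analogy using Java’s TreeMap as a stand-in for an index, not a model of Oracle internals; the names and sample keys are invented.

```java
import java.util.Comparator;
import java.util.TreeMap;

/* Sketch only: an ordered map standing in for a database index. */
final class LinguisticIndexSketch {

    /* Builds an "index" over the keys, ordered by the given comparator. */
    static TreeMap<String, Integer> buildIndex(Comparator<String> order, String... keys) {
        TreeMap<String, Integer> index = new TreeMap<>(order);
        for (int i = 0; i < keys.length; ++i) {
            index.put(keys[i], i);
        }
        return index;
    }

    /* A case-insensitive search of a binary-ordered index has to visit every entry. */
    static int fullScanCount(TreeMap<String, Integer> binaryIndex, String wanted) {
        int matches = 0;
        for (String key : binaryIndex.keySet()) {
            if (key.equalsIgnoreCase(wanted)) {
                ++matches;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] keys = { "Nominet", "example", "ZEBRA" };
        TreeMap<String, Integer> binary =
            buildIndex(Comparator.<String>naturalOrder(), keys);
        TreeMap<String, Integer> linguistic =
            buildIndex(String.CASE_INSENSITIVE_ORDER, keys);

        System.out.println(binary.containsKey("nominet"));      // false: direct probe misses
        System.out.println(fullScanCount(binary, "nominet"));   // 1: found only by scanning
        System.out.println(linguistic.containsKey("nominet"));  // true: direct probe works
    }
}
```

In Oracle terms, the counterpart of the second map would be a linguistic index, i.e. a function-based index on NLSSORT of the column using the same sort setting as the session.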

Exporting Lotus Notes Calendar entries - update

Posted by ian on Jan 3rd, 2008

I wrote yesterday about exporting Lotus Notes calendar entries. Since then a colleague has suggested an alternative method.

From your chosen Calendar view, such as Day, Week or Month, select events using the mouse. Then use the same export method as before to produce an iCalendar file and import into your chosen calendar application.

This method of event selection has its own problems. It does ensure that events in the iCalendar file have the same timings as in Notes, but it has problems with repeated events. For example, I have fortnightly meetings held on alternate Monday mornings. When importing the iCalendar file into Apple’s iCal application I can see one occurrence, but no repeats. This is usually (but not always) the first occurrence. Google Calendar ignores these repeated events completely.

One thing to note is that selecting everything in my Notes Calendar view doesn’t work at all. Exporting results in a file which is 1.2 MB in size, but this produces an empty calendar on import into either iCal or Google Calendar. I guess there is something in there they don’t like.
