random technical thoughts from the Nominet technical team

ENUM XML certificate expiry

Posted by Anthony on Aug 25th, 2011

The ENUM registry has been live for 2 years, so we are now starting to see problems due to certificate expiry.

The certificates are stored in a form that is not human-readable (PEM-encoded X.509), something like:

-----BEGIN CERTIFICATE-----
bTrz5/Otr2SRW28VzbeZ3sbJrv8gRPHCtAR4EPLdnmjE5iBaiuUTOphCBIYode7a5oeE2IAL+ggT
GB1XlaXpbM9DualnpBED59cv0PyuA99wc7lPX4t6CzaMOqgjdLH5EQiYCM2qNMGjI9Y1cF3GTmeX
baMilzXw037Dv+9uFbRBFEP+ey6cYIDxVIdZ3b3Jgs8mCtm2X9eePYYqlUMXawq511M5fgey75lz
YrwQem9EgVtIvD7QCVMp75BR2nVXyqH6tA2+oBosfVcdgHK9eJu+1KIIPofGaJDTCjx483dUe/Bn
nS82bXmaiU9+KshsOYy6VNFh4i9QSY9s7Yb9roFx
-----END CERTIFICATE-----

so it’s not straightforward to see whether a cert has expired.

You can see the expiry date using openssl, e.g.

$ openssl x509 -in testVA_cert.pem -dates -noout
notBefore=Mar  8 10:10:03 2011 GMT
notAfter=Mar  7 10:10:03 2013 GMT

or you can view everything with:

$ openssl x509 -in testVA_cert.pem -text -noout
...
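The same check can be scripted. Below is a minimal Java sketch, assuming the certificate is in a PEM or DER file whose path is passed as the first argument; the class name CertExpiry and the helper daysRemaining are illustrative names, not part of any Nominet tooling.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;

class CertExpiry {
    /** Whole days from 'now' until 'notAfter', truncated toward zero; negative means expired. */
    static long daysRemaining(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter).toDays();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the certificate file, e.g. testVA_cert.pem
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            X509Certificate cert = (X509Certificate)
                    CertificateFactory.getInstance("X.509").generateCertificate(in);
            System.out.println("notBefore=" + cert.getNotBefore());
            System.out.println("notAfter=" + cert.getNotAfter());
            System.out.println("days remaining: "
                    + daysRemaining(cert.getNotAfter().toInstant(), Instant.now()));
        }
    }
}
```

Run it as java CertExpiry testVA_cert.pem; a small or negative "days remaining" value is the cue to renew.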


Enabling SSL / https and disabling http in OBIEE 11.1.1.5

Posted by arjan on Jun 17th, 2011

It seems trivial to enable the https channel in OBIEE for the server in which your OBIEE is installed. And in fact enabling SSL is trivial. Just tick the “SSL Listen Port enabled” tickbox on the Configuration=>General tab for that server (bi_server1 by default). No need to restart anything.

However, the problem starts when you also want to disable the http channel, which you will want in most cases where you enable https. Making that change and saving it still works fine, but when I tried to restart bi_server1 I encountered the error:

Server subsystem failed. Reason: java.lang.AssertionError: No replication server channel for bi_server1
java.lang.AssertionError: No replication server channel for bi_server1

My setup is not a clustered one, but by default OBIEE creates a cluster in WebLogic and assigns your bi_server1 to it.

I did not have this issue in 11.1.1.3, but it has now appeared in 11.1.1.5 for me. After trying different things I started looking under Environment=>Clusters=>bi_cluster=>Replication to see whether there is a “Replication Channel”. There is one by default (“ReplicationChannel”). Looking at the rest of the options on that tab I found the tickbox “Secure Replication Enabled”. Ticking this box solved the problem for me. bi_server1 now starts normally again with only the HTTPS listen port enabled.

My theory is that since bi_server1 is part of a cluster and wants to replicate its session state to other parts of the cluster it needs to be able to communicate with other servers via this ReplicationChannel. Since the HTTP Listen port is disabled, the only port left is an HTTPS port, but by default there is no channel for it. During startup we can see the message:

<Waiting to synchronize with other running members of bi_cluster.>

Right after that the above error appears, so somehow communication fails or there is a prerequisite check failing even though there is no other server to replicate sessions with.

After some digging in the documentation I found this note at http://download.oracle.com/docs/cd/E21764_01/web.1111/e13701/network.htm#CNFGD167 :

 Note:

Unless specified, WebLogic Server uses the non-secure default channel for cluster communication to send session information among cluster members. If you disable the non-secure channel, there is no other channel available by default for the non-secure communication of cluster session information. To address this, you can:

Anyway, I hope that helps when you encounter this error.

How to corrupt your Data Dictionary with Oracle Streams

Posted by arjan on Jun 3rd, 2011

Hopefully this blog will save you some trouble if you are working with Oracle Streams. If you do what I did on a production database you will end up with a corrupt Data Dictionary. Luckily I didn’t encounter this issue in production, but in development.

I was preparing our target data warehouse on a 11.2.0.2 development database (Linux x86_64)  following Note 753158.1 from “My Oracle Support”. It shows how to configure your database for downstream data capture. Basically one creates a tablespace that is the default tablespace for the Streams Administrator and you create a user that becomes the Streams Administrator.

create tablespace bi_streams datafile '&&datafile_dir.bi_streams01.dbf'
size 256m autoextend on next 10m maxsize unlimited;

exec dbms_logmnr_d.set_tablespace ('bi_streams');

drop user bi_streams_admin cascade;

create user bi_streams_admin identified by pwd
default tablespace bi_streams
temporary tablespace temp
quota unlimited on bi_streams;

Notice the call to dbms_logmnr_d.set_tablespace ('bi_streams');. The documentation tells us this about the function of that procedure:

The SET_TABLESPACE procedure re-creates all LogMiner tables in an alternate tablespace.

Let’s say you’re scripting this configuration and something goes wrong at a later stage. You want to retry the configuration, so to start with a clean slate you drop the tablespace you named in the call to dbms_logmnr_d.set_tablespace ('bi_streams');. This causes trouble, potentially a lot of it. The next call to dbms_logmnr_d.set_tablespace ('bi_streams'); fails with this error:

ERROR at line 1:
ORA-04063: package body "SYS.DBMS_LOGMNR_INTERNAL" has errors
ORA-06508: PL/SQL: could not find program unit being called:
"SYS.DBMS_LOGMNR_INTERNAL"
ORA-06512: at "SYS.DBMS_LOGMNR_D", line 135
ORA-06512: at line 1

On investigation I found a list of over 100 invalid objects, all related to Streams and log mining. This probably makes sense, as the tablespace that I dropped may have contained objects that are Data Dictionary objects, or on which Data Dictionary objects are based. As a result, running utlrp.sql to recompile the invalid objects didn’t help.

I opened an SR with Oracle and they advised dropping all the invalid objects and rerunning catalog.sql and catproc.sql; however, that didn’t help either. Oracle then asked whether the database could be recreated or restored. Luckily this was still an empty test environment, so that was no problem. However, imagine you do this on a live database with 24×7 uptime requirements. You will then be lucky if you have a backup of that tablespace and can restore it and revalidate the objects. Otherwise you could be looking at restoring your production database.

I decided to protect myself against this issue by building the Logminer Data Dictionary in the SYSAUX tablespace which it does by default. I am much less likely to try dropping that tablespace. In fact, this is only possible when the database has been opened in MIGRATE mode. It protects me from corrupting the Data Dictionary and I don’t have to call dbms_logmnr_d.set_tablespace.

What if the LogMiner Data Dictionary is already in a custom tablespace that you now want to drop? You can call DBMS_LOGMNR_D.SET_TABLESPACE('SYSAUX') and this will move the existing objects to the SYSAUX tablespace. That enables you to drop the previous tablespace without corrupting the Data Dictionary.

DNSSEC incident report

Posted by Simon McCalla on Sep 24th, 2010

We had an incident two weeks ago with our DNSSEC signing system causing it to accidentally release a new Zone-signing-key into our live zone file. We have spent the last two weeks looking into how this may have happened and have produced a report (attached) that helps explain what we have found and some of the new procedures we have put in place to prevent a re-occurrence.

DNSSEC incident report (PDF)

Many thanks to those of you who helped us in both diagnosing the incident and making some really useful suggestions as to causes and procedure changes. Your input was enormously helpful.

Simon.

Verifying ENUM signatures

Posted by Anthony on Aug 17th, 2010

When an ENUM user sends us a Create command, we validate the XML against the schemas and check that the XML signature’s chain of trust leads to our CA. When this fails, there isn’t much feedback that we can return to the user, and it’s difficult to diagnose what caused the failure.

It’s possible to validate a signature with oXygen but all that says is “Invalid Signature” if there’s an error.

So I’ve put together some Java code which produces more detailed diagnostics; see ValidateEnumCreateJava.zip (or as a .jar file if you don’t have a Java compiler).

Before you start

I recommend doing an XML validity check first: don’t waste time trying to debug XML signature problems if you haven’t.

One way to do this is to use Sun’s Multi Schema Validator (https://msv.dev.java.net/) as suggested in the README in our schema bundles, i.e.

java -jar /path/to/msv.jar /path/to/nom-enum-root-2.0.xsd your_file.xml

Running the ENUM signature checker

  • compile as:

    javac ValidateEnumCreate.java
  • run as:

    java ValidateEnumCreate <yourfile>

or if you don’t have a Java compiler…

  • run from a .jar file:

    java -jar ValidEnumCreate.jar <yourfile>

Results:

Valid Signature

If all is well, the result should be

Signature Validated OK

The response for an invalid signature depends on what was wrong:

Bad DigestValue

If the digest is different but the signature of that digest is correct, the result will be

Signature 0 failed core validation:

Checking that the digest matches the data:
FAIL: DigestValue does not match data
    (Signature 0 ref['0'] validity status: false)

Checking the signature of the digest:
PASS: SignatureValue verifies DigestValue
    (Signature 0 validation status: true)

This is possibly due to munging of whitespace. The signed XML is fragile and is sensitive even to changes in whitespace between tags (I commented on this in an earlier blog article).

Bad Signature or certificate

If the signature is invalid or the wrong certificate is included, the results will be:

Signature 0 failed core validation:

Checking that the digest matches the data:
PASS: DigestValue matches data
    (Signature 0 ref['0'] validity status: true)

Checking the signature of the digest:
FAIL: SignatureValue does not verify DigestValue
    (Signature 0 validation status: false)

Other errors

  • Failure to parse the XML - error message
  • Failure to decode the Digest/Signature/Certificate - Java exception + stack trace
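The two-step diagnosis described above can be sketched with the standard javax.xml.crypto.dsig API. This is not the code from ValidateEnumCreateJava.zip, only an illustration: it assumes the caller supplies a trusted public key and a namespace-aware DOM document, and the class and method names are made up for the example.

```java
import java.security.PublicKey;
import javax.xml.crypto.KeySelector;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class SignatureDiagnostics {

    /** Maps the two independent checks onto the failure cases listed above. */
    static String explain(boolean digestOk, boolean signatureValueOk) {
        if (digestOk && signatureValueOk) return "Signature Validated OK";
        if (signatureValueOk) return "Bad DigestValue: signed content changed (whitespace?)";
        if (digestOk) return "Bad signature or wrong certificate";
        return "Both the digest and the signature checks failed";
    }

    /** Checks the first ds:Signature element in the document against a trusted key. */
    static String diagnose(Document doc, PublicKey trustedKey) throws Exception {
        NodeList sigs = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature");
        if (sigs.getLength() == 0) {
            return "No Signature element found";
        }
        DOMValidateContext ctx = new DOMValidateContext(
                KeySelector.singletonKeySelector(trustedKey), sigs.item(0));
        XMLSignature sig = XMLSignatureFactory.getInstance("DOM").unmarshalXMLSignature(ctx);

        // Step 1: does SignatureValue verify over SignedInfo (which carries the digests)?
        boolean signatureValueOk = sig.getSignatureValue().validate(ctx);

        // Step 2: does each Reference's DigestValue match the data it points at?
        boolean digestOk = true;
        for (Object ref : sig.getSignedInfo().getReferences()) {
            digestOk &= ((Reference) ref).validate(ctx);
        }
        return explain(digestOk, signatureValueOk);
    }
}
```

Checking the digest and the signature value separately is what makes it possible to tell a whitespace change in the signed content apart from a wrong key or certificate.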

References

http://jtute.com/java6/0904.html
http://java.sun.com/developer/technicalArticles/xml/dig_signature_api/
http://weblogs.java.net/blog/mullan/archive/2006/01/my_xml_signatur_1.html
http://weblogs.java.net/blog/2007/08/03/even-more-xml-signature-debugging-tips

DNS RFC Dependency Graphs

Posted by ray on May 24th, 2010

Spurred by a recent Slashdot posting, I’ve produced some graphs showing the relationships between the RFCs which define the DNS protocol.

The graphs (which are in SVG format) split the DNS-related RFCs into three groups (although some RFCs end up in more than one group):

The point of these graphs is not to show which RFCs refer to other RFCs, but to show which RFCs update or obsolete other RFCs. Hence the graphs give an “at a glance” overview of which RFCs define the DNS protocol as it is now.

Boxes in grey indicate obsoleted RFCs, and square boxes indicate Informational or Best Current Practice documents.  Hovering over a box should tell you the title of the RFC, and clicking on a box will take you to the RFC itself.
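For readers who want to build something similar, here is a rough Java sketch that turns "obsoletes" and "updates" relationships into Graphviz DOT text, with obsoleted RFCs filled grey. The post does not say how the graphs were actually generated, so the use of Graphviz and every name in the sketch are assumptions; the sample data (RFC 1035 obsoleting RFCs 882, 883 and 973, and RFC 2181 updating RFCs 1034, 1035 and 1123) is only a tiny subset.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RfcGraph {
    /** Emits DOT text; each edge points from the newer RFC to the one it affects. */
    static String toDot(Map<Integer, List<Integer>> obsoletes,
                        Map<Integer, List<Integer>> updates) {
        StringBuilder dot = new StringBuilder("digraph dns {\n");
        obsoletes.forEach((newer, olds) -> {
            for (int old : olds) {
                // Obsoleted documents are drawn as grey boxes.
                dot.append("  rfc").append(old).append(" [style=filled, fillcolor=grey];\n");
                dot.append("  rfc").append(newer).append(" -> rfc").append(old)
                   .append(" [label=\"obsoletes\"];\n");
            }
        });
        updates.forEach((newer, olds) -> {
            for (int old : olds) {
                dot.append("  rfc").append(newer).append(" -> rfc").append(old)
                   .append(" [label=\"updates\"];\n");
            }
        });
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> obsoletes = new LinkedHashMap<>();
        obsoletes.put(1035, List.of(882, 883, 973));
        Map<Integer, List<Integer>> updates = new LinkedHashMap<>();
        updates.put(2181, List.of(1034, 1035, 1123));
        System.out.print(toDot(obsoletes, updates));
    }
}
```

The output can be piped through a DOT renderer to obtain SVG.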

The picture below is just a low resolution sample - click on the picture or on the links above to access the scalable SVG versions.

DNS Protocol Graph

Please let me know if you believe I’ve missed anything, or miscategorised any document.

Multithreading and first come, first served

Posted by charles on Mar 15th, 2010

A recent query from a registrar has prompted the Registrar Systems Support team to take a close look at how our EPP system works with Nominet’s first come, first served approach. The nature of our EPP service makes this challenging to apply and it may not be applied in the way you expect.

If we look at Nominet’s EPP service, the “EPP server” itself is actually multiple load-balanced servers, each running multithreaded processes. These communicate with XML translation hardware, also through load balancers, and with a database. A combination of factors can change the sequence in which transactions are acted upon. These include:

  1. which EPP server the request goes to
  2. thread scheduling (at the operating system level) in the specific EPP server
  3. which piece of hardware the XML load balancers select
  4. scheduling of translation requests within the XML hardware
  5. internal scheduling within the database

When you look at how EPP handles requests over “large” periods of time, EPP is clearly a first come, first served system. However because of the nature of multithreaded systems, it is not feasible (or desirable) to apply that principle when the period of time is a handful of milliseconds. Most of the advantages that EPP has over the Automaton stem from the fact that it is multithreaded and acts on requests in parallel.

The principle behind first come first served is that no party is given preferential treatment when registering a domain name. We do not shape our traffic and no registrar is given any sort of priority when our EPP system processes a request. We apply first come first served based on when the first valid request gets committed to the database. We must do this as we operate three different registration systems.
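The effect can be reproduced in miniature. The Java sketch below is a toy model, not Nominet's EPP implementation: two requests for the same name are handed to a thread pool in arrival order, each spends a different (assumed) time in processing, and whichever reaches the commit step first wins. The registrar names, the domain name and the delays are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FirstCommitWins {
    private static final String NAME = "example.co.uk"; // hypothetical domain

    /**
     * Request A arrives first and request B second; each is processed for the
     * given number of milliseconds before it tries to commit.
     * Returns the registrar whose request was committed first.
     */
    static String race(long processingMsA, long processingMsB) {
        ConcurrentHashMap<String, String> registry = new ConcurrentHashMap<>();
        ExecutorService workers = Executors.newFixedThreadPool(2);
        workers.submit(() -> process(registry, "registrar-A", processingMsA));
        workers.submit(() -> process(registry, "registrar-B", processingMsB));
        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return registry.get(NAME);
    }

    private static void process(ConcurrentHashMap<String, String> registry,
                                String registrar, long processingMs) {
        try {
            Thread.sleep(processingMs); // stands in for parsing, translation, scheduling
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        // The "commit": only the first valid request for the name is recorded.
        registry.putIfAbsent(NAME, registrar);
    }

    public static void main(String[] args) {
        System.out.println("winner: " + race(300, 10));
    }
}
```

With these delays registrar-B wins even though registrar-A's request arrived first, because ordering is decided at commit time rather than at arrival.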

For registrars who work in an environment where milliseconds can mean the difference between successfully registering a domain name or not, this may be significant when deciding how your EPP client communicates with Nominet.

Can Cloud computing be a threat for security?

Posted by alessandro on Nov 19th, 2009

A cloud refers to “the provision of dynamically scalable and often virtualized resources as a service over the Internet” (from Wikipedia). In practice, a user who logs in to a cloud service (the bottom of the Wikipedia page lists some of them) can, for a reasonable price, rent “resources” such as disk space or virtual machines to run their own code.

Recently, I have been monitoring the queries coming to our WHOIS service and have noticed that several requests originated from machines belonging to the IP space of a well-known commercial cloud. Since the WHOIS is a free service and can be queried from any machine, I strongly suspect this technique has been used to avoid hitting the limit of 1000 queries/day set by Nominet’s Acceptable Use Policy on a per user basis (and not per IP).

The impact of this episode, as far as I can see, is limited and maybe not worth too much attention. What is interesting, however, is the way the cloud has been used to circumvent Nominet’s rules. This raises questions about how easy it would be for a malicious user to exploit a cloud computing environment for illegal activities, and how long we shall wait before the first large-scale attack based on this technology is reported.

If we consider how the cloud environment works, we realise that:

  • A cloud gives a malicious user access to a virtually unlimited pool of resources and computing power
  • It is difficult to enforce limits on the amount of resources a single user is allowed to control, because this would harm legitimate users without preventing malicious ones from opening multiple accounts
  • Monitoring all processes and activities that run on the cloud is quite complex, maybe impractical. Besides, I don’t think legitimate users would be happy with service providers inspecting their data. They will be forced to use cryptography, which will make things even worse
  • Even assuming that a service provider could offer some level of protection against misuse of its service, malicious users could spread their activities across different cloud providers, making early detection very complex
  • Finally, accessing cloud services is cheap and prices are expected to drop with the technology behind big data centres becoming more accessible.

The security issues associated with cloud computing are not unknown (recently, for example, botnet controllers have been discovered in the Google cloud). The problem is that attacks of this kind, and the threat associated with them, are likely to increase in the coming years.

Defending against a cloud-based attack might not be easy and will need to rely on the “good will” of the cloud service providers, which will be expected to monitor their users’ activities. And, to cite Jose Nazario of Arbor Networks in a recent interview with The Register, “going to a company as big as Google and saying ‘Can we get an image of that server,’ that’s a pretty high barrier”. This is especially true for small and medium-sized organisations affected by small or medium-sized attacks.

Mutation Testing with Jumble

Posted by chris on Sep 28th, 2009

Mutation testing is a technique for checking how good your unit tests are.  It mutates a class by, for example, swapping a subtraction for an addition or negating an if statement.  It then runs the unit tests for that class.  If none of them fail, then maybe your tests are not good enough.
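To make that concrete, here is a hand-written illustration in Java. It does not use Jumble; the "mutant" is written out by hand and all names are invented for the example. It shows a weak test that still passes after a subtraction is swapped for an addition, and a stronger test that fails as it should.

```java
import java.util.function.IntBinaryOperator;

class MutationDemo {
    // Code under test, and a mutant with the subtraction swapped for an addition.
    static final IntBinaryOperator ORIGINAL = (balance, amount) -> balance - amount;
    static final IntBinaryOperator MUTANT = (balance, amount) -> balance + amount;

    // Weak test: withdrawing 0 cannot tell '-' from '+', so the mutant survives.
    static boolean weakTest(IntBinaryOperator withdraw) {
        return withdraw.applyAsInt(10, 0) == 10;
    }

    // Stronger test: a non-zero amount makes the mutant fail, which "kills" it.
    static boolean strongTest(IntBinaryOperator withdraw) {
        return withdraw.applyAsInt(10, 3) == 7;
    }

    public static void main(String[] args) {
        System.out.println("weak test passes on mutant:   " + weakTest(MUTANT));
        System.out.println("strong test passes on mutant: " + strongTest(MUTANT));
    }
}
```

A mutation testing tool automates this: it generates the mutants and reports those that no test manages to kill.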

We took a look at Java mutation testing tools some time ago as a possible addition to our continuous integration system.  But at that time the tools fell short.  Jester works by mutating the source files.  We found that this was just too slow with all the compilation it needs to do.  Jumble is a bit smarter in that it mutates the bytecode of the compiled class files.  However, when we evaluated it only JUnit 3 tests were supported and we were already well on the way to transitioning most of our tests to JUnit 4.

Recently though, Jumble was modified to work with JUnit 4 tests, so I thought it was time to take another look at it.  I found it quite tricky to get it working as there is no ant integration and it only runs as a command line application.  However, with a bit of persuasion I managed to get it running over our codebase using ant.  The details of how I did this are given below, but I think that a better solution would definitely be a fully fledged custom ant task.

So how did I get on?  Initially I had some niggly classloader problems. It seems that you need to tell Jumble’s mutating classloader to defer the loading of various sets of classes to the default classloader.  I needed to do this for the JMX classes found under javax.management by specifying the command line flag --defer-class=javax.management. Once I’d done this I had it working and did indeed find some interesting things.  I found tests that had been cut-and-pasted and not changed to test what they claimed to test. I found some test data that wasn’t up to the job and I found an actual bug in the code.

However, I hit a roadblock once the code under test used a database.  For some reason Oracle’s JDBC driver would not behave. Even before any mutations were applied it would insist on trying to connect with a null password.  This meant that it tried a number of times before locking itself out of the database.  I assume that this is some kind of classloader thing, but it seems strange that it manages to successfully contact the database only to completely mess up the credentials.  I’ve contacted the developers of Jumble (who are based in New Zealand incidentally), but no solution has been forthcoming as yet.  Until this problem is fixed we won’t be able to add this to our continuous integration system, which is a shame, as I think it could be a useful tool.

So, how did I integrate Jumble into ant?  I used the following macrodef to run Jumble:

<macrodef name="do-mutation">
    <attribute name="class-to-mutate"/>
    <sequential>

        <!-- Use this trick to convert a reference to a classpath to classpath as a string -->
        <property name="the-classpath-as-a-string" refid="execute-test-classpath"/>        

        <java dir="${base.directory}" classname="com.reeltwo.jumble.Jumble" fork="true">

            <!-- all we really need in this classpath is the jumble library -->
            <classpath refid="ants-own-classpath"/>

            <!-- but then it needs everything in the JVM it forks: -->
            <arg value="--classpath=${the-classpath-as-a-string}"/>
            <arg value="--exclude=equals,hashCode,toString"/>
            <arg value="--defer-class=javax.management."/>
            <arg value="@{class-to-mutate}" />
        </java>

    </sequential>

</macrodef>

To get this to work you will need the classpath used to run the tests set up with the id execute-test-classpath and the classpath used by ant to find custom tasks set up with the id ants-own-classpath. The latter will need to include the jumble jar. As you can see I have excluded the equals(), hashCode() and toString() methods from being mutated: the first two are often generated by the IDE anyway, and toString() rarely contains important logic.  As mentioned before, the JMX classes are deferred to the default classloader. You may need to add some other packages to get it to work in your environment.

The above macrodef will mutate a single class. But we want to run this across our whole code base. To do this I had to resort to using some further ant tricks in the shape of the ant-contrib library which adds additional functionality to ant. As I said before, a better solution would be to write a proper ant task. Here is how the macrodef is called to run Jumble over a whole project:

<target name="mutation.test">    

    <!-- The ant contrib jar contains the "for" task -->
    <taskdef resource="net/sf/antcontrib/antlib.xml" classpathref="ants-own-classpath"/>

    <!-- Strip off the leading directory and the .class. Then replace the slashes
         with dots and separate each one with a comma.
         Results go into the classlist property -->

    <pathconvert dirsep="." pathsep="," property="classlist">
        <mapper type="glob" from="${build.dir}/*.class" to="*"/>
        <fileset refid="classes-to-mutate"/>
    </pathconvert>

    <!-- Iterate through the comma separated list and call jumble -->
    <for list="${classlist}" param="class">
        <sequential>
            <do-mutation class-to-mutate="@{class}"/>
        </sequential>
    </for>

</target>

This takes a fileset with id classes-to-mutate which consists of classes under the directory ${build.dir}. It turns the path into a package and removes the .class from the end to get a list of classes to mutate. (NB This was written to work with Unix style paths, it may need alteration to work under Windows). Then the macrodef given previously is called for each. Note that we refer to the ants-own-classpath classpath again which must contain the ant contrib jar this time.  The code given above was put in a standard ant build file included by other projects. The fileset to mutate could then be defined in each like this:

<fileset id="classes-to-mutate" dir="${build.dir}" includes="**/*.class">
        <exclude name="**/WeWantToLeaveThisOut.class"/>
</fileset>
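For anyone unsure what the pathconvert step does to each file, the same transformation can be written out in a few lines of Java. This is only an illustration of the mapping, assuming Unix style paths as in the build file; the class name, method name and example paths are hypothetical.

```java
class ClassNames {
    /** Turns e.g. /proj/build/com/example/Foo.class into com.example.Foo. */
    static String toClassName(String buildDir, String classFile) {
        String prefix = buildDir.endsWith("/") ? buildDir : buildDir + "/";
        if (!classFile.startsWith(prefix) || !classFile.endsWith(".class")) {
            throw new IllegalArgumentException("not a class file under " + buildDir);
        }
        // Strip the leading directory and the .class suffix, then turn slashes into dots.
        String relative = classFile.substring(prefix.length(),
                classFile.length() - ".class".length());
        return relative.replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(toClassName("/proj/build", "/proj/build/com/example/Foo.class"));
    }
}
```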

UPDATE: The story gets even weirder. One of the Jumble developers got back to me with a suggestion for how to fix the Oracle problem, which was to add the JDBC driver to the list of classes deferred to the parent classloader. This didn’t help, but I then discovered bizarrely that the JDBC problem goes away if you are connecting to an 11g database as opposed to a 10g one. So it means that somehow the 10g JDBC driver fails to connect to 10g when run by Jumble, but succeeds against a later version. Curiouser and curiouser…

evldns - A Framework for Light-weight DNS Servers

Posted by ray on Aug 10th, 2009

I’ve recently written and released source code for “evldns”.

evldns is a software mashup - it takes libevent’s fast event processing code and combines it with ldns’s DNS packet handling.  It’s derived from the server-side half of libevent’s “evdns” component.

The resulting framework is particularly intended for writing servers which generate custom responses. Examples included are:

  • an AS112 server which has been benchmarked at over 60,000 queries per second on an HP DL385 server.
  • a server which responds with the IP address of the client which sent the query - this can be useful for network discovery

The framework could also be used to write a “fuzzing” DNS server - one that deliberately returns malformed responses so as to trigger and test for bugs in DNS clients.

Here’s an extract from the package’s README:

evldns works using callback functions. A list of packet matching patterns
may be registered, along with a pointer to the function that will be
invoked when each pattern is matched.

The packet match works on the usual DNS triple of (QNAME, QCLASS, QTYPE)
where QNAME may be an exact match or a wildcard, and QCLASS or QTYPE may
be “ANY”.

The callback function is passed two parameters:

void callback(struct evldns_server_request *req, void *data)

The “req” parameter contains the complete received DNS request as an
“ldns_pkt”. The callback should create a response packet and populate
“req” with that response, which may either be in raw wire format
(req->wire_response and req->wire_len) or in ldns format (req->response).

If the callback function fails to populate either of the response fields
then the evldns system will pass the received packet onto the next
matching callback.

Should no callback match then evldns will automatically generate and
return a packet with RCODE = 5 (Refused).

The “data” parameter is used to pass an additional parameter supplied when
the callback function was registered. See “mod_txtrec.c” for an example
of how “data” may be used to pass expected response data into a callback.

A complete evldns application requires just a few lines of code:

event_init(); /* initialise libevent */
evldns_init(); /* initialise evldns */

/* create an evldns server context */
struct evldns_server *server = evldns_add_server();

/* register a UDP socket with evldns */
evldns_add_server_port(server, bind_to_udp4_port(53));

/* register callbacks here */
evldns_add_callback(server, qname, qclass, qtype, callback, data);
...

/* and set libevent running */
event_dispatch();
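The dispatch rules in the README (match on the query triple, fall through to the next matching callback when one declines to answer, and return Refused when nothing answers) can be modelled in a few lines. The Java sketch below is a toy model of that behaviour, not the evldns API: responses are plain strings, "*" is assumed to be the only wildcard form, and every name is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class Dispatcher {
    static final String ANY = "ANY";
    static final String REFUSED = "RCODE=5 (Refused)";

    static final class Query {
        final String qname, qclass, qtype;
        Query(String qname, String qclass, String qtype) {
            this.qname = qname;
            this.qclass = qclass;
            this.qtype = qtype;
        }
    }

    private static final class Entry {
        final Query pattern;
        final Function<Query, Optional<String>> callback;
        Entry(Query pattern, Function<Query, Optional<String>> callback) {
            this.pattern = pattern;
            this.callback = callback;
        }
        boolean matches(Query q) {
            boolean nameOk = pattern.qname.equals("*") || pattern.qname.equalsIgnoreCase(q.qname);
            boolean classOk = pattern.qclass.equals(ANY) || pattern.qclass.equals(q.qclass);
            boolean typeOk = pattern.qtype.equals(ANY) || pattern.qtype.equals(q.qtype);
            return nameOk && classOk && typeOk;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Registers a callback; an empty Optional means "no response, try the next match". */
    void addCallback(String qname, String qclass, String qtype,
                     Function<Query, Optional<String>> callback) {
        entries.add(new Entry(new Query(qname, qclass, qtype), callback));
    }

    /** Tries matching callbacks in registration order; refuses if none responds. */
    String handle(Query q) {
        for (Entry e : entries) {
            if (!e.matches(q)) continue;
            Optional<String> response = e.callback.apply(q);
            if (response.isPresent()) return response.get();
        }
        return REFUSED;
    }
}
```

Registering a specific pattern before a catch-all one gives the specific callback the first chance to answer, with the catch-all acting as a fallback.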

Please see the project home page for more information. There is also a Google hosted discussion group.

Ray Bellis, Advanced Projects Team
