random technical thoughts from the Nominet technical team

Clearing up an ORA-16011 error

Posted by jason on Apr 30th, 2007

As mentioned previously, we run our database on an Oracle RAC cluster. Recently an instance of our physical standby had a problem, which meant we had to change the node of the standby cluster on which the managed recovery process ran. Afterwards, one node of our primary was showing the following:

SQL> SELECT DEST_ID "ID", STATUS "DB_status",
DESTINATION "Archive_dest", ERROR "Error"
FROM V$ARCHIVE_DEST
WHERE DEST_ID <= 3;
1 VALID /path/to/arhived_redo/SID/
2 VALID CROSS_INSTANCE
3 DISABLED STANDBY
ORA-16011: Archivelog Remote File Server process in Error state

The odd thing was that there was no connectivity issue between this node of the primary and the surviving node of the standby: I could use sqlplus to connect to the standby instance from this primary node. The second node of our primary was showing dest 3 as valid, so somehow this node, or rather an RFS process on the standby, had got confused. I tried restarting the standby instance, but this did not clear the ORA-16011 error. While the error remained, the standby showed an ever-increasing transport lag, so it was clearly not receiving any redo from this node of the primary. To fix it I had to do the following:

SQL> alter system set log_archive_dest_3='' scope=memory;

Then recreate log_archive_dest_3 exactly as you had it before. This cleared everything up nicely, and our standby is now back up to date. There is not much information out there about ORA-16011. This snippet from the Oracle online documentation mentions fixing the standby, but I don’t think that is correct.

When Oracle ASM goes bad

Posted by jason on Apr 27th, 2007

We have been using ASM with our Oracle 10.2 Linux clusters for over a year now. I have found it to be extremely stable. Until now. One of our RAC database instances crashed with the following:

Error: KGXGN aborts the instance (6) 
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702

Trying to restart the instance produced the following errors:

ORA-00202: control file: '+ASM/path_to_file/control02.ctl'
ORA-17503: ksfdopn:2 Failed to open file +asm/path_to_file/control02.ctl
ORA-15001: diskgroup "ASM" does not exist or is not mounted

Clearly for some reason there were issues for this instance accessing the datafiles. I saw that ASM was still running so I looked at the ASM instance alert logs and found the following:

ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], []
NOTE: cache dismounting group 3/0x6A02BCB9 (ASM)
ERROR: diskgroup ASM was not mounted

Looking up the ORA-600 code on Metalink, I came across document 418063.1. This matched exactly what we were seeing, and all the stack trace calls matched up. There was one crucial difference between our environment and the one described in the note: the note stated that this was fixed in the 10.2.0.3 patchset. We are running 10.2.0.3 on this cluster, so this “fix” should be helping us. I never encountered this issue in a year of running 10.2.0.2, but have now hit it after 7 weeks on the so-called fixed 10.2.0.3. We still await a solution.

SQL*Plus batch mode trickery

Posted by chris on Apr 27th, 2007

In our ant build scripts we have targets to set up and tear down test databases. This is useful for our continuous integration environment. Of course we also want these same scripts to be applied to the live environment. This means that they need to be parameterized, not least so that passwords can be different between testing and production. Since we are using SQL*Plus to apply these scripts to an Oracle database, we use substitution variables. For example, to connect to a schema called blah we use the following syntax within our SQL scripts:

connect blah/&&blah_password@&&database

allowing the password and database to be parameterized. That’s fine until someone makes a typo and we get this instead:

connect blah/&&blah_apssword@&&database

If you are running interactively, the script will stop at this point and ask you for the value of blah_apssword. But when you run via ant, you find that it gives the following cryptic message and keeps on going regardless:

Enter value for blah_apssword:
User requested Interrupt or EOF detected.

We’d rather that it failed at this point. So let’s tell SQL*Plus to fail on error:

WHENEVER SQLERROR EXIT SQL.SQLCODE
connect blah/&&blah_apssword@&&database

This doesn’t quite work because the failure to connect is a SQL*Plus error, not a SQL one. The solution is to pipe something into STDIN so that when SQL*Plus hits an unknown variable it reads this in and uses it. I opted to pass in a single dot. The script is being called by ant’s exec task, so we can set the input using the inputstring attribute. This gives us what we want:

Enter value for blah_apssword: ERROR:
ORA-01017: invalid username/password; logon denied

So now our script will fail when we hit an unknown substitution variable (so long as a single dot is not a valid value for it). Any better solutions would be most welcome…
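
For anyone driving SQL*Plus from something other than ant, the same trick can be sketched in Python. This is only an illustration under stated assumptions: the `run_batch` helper is a made-up name, and a small Python child process stands in for SQL*Plus so the example runs anywhere. The point is simply that a single dot is supplied on STDIN, so an unexpected prompt gets an answer instead of hitting EOF.

```python
import subprocess
import sys


def run_batch(cmd, answer=".\n"):
    """Run cmd with `answer` piped to its STDIN.

    Returns (exit_code, combined stdout and stderr). `run_batch` is a
    hypothetical helper name, not part of any library.
    """
    result = subprocess.run(cmd, input=answer, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


# Stand-in for SQL*Plus: a child process that prompts for a value and reads
# one line from STDIN, much as SQL*Plus does for an undefined variable.
stand_in = [
    sys.executable,
    "-c",
    "v = input('Enter value for blah_apssword: '); print('got', repr(v))",
]

code, output = run_batch(stand_in)
print(code, output)

# With a real installation the call might look like this (setup.sql is a
# hypothetical script name):
#   run_batch(["sqlplus", "-S", "/nolog", "@setup.sql"])
```

The caller still has to check the exit code, which is where WHENEVER SQLERROR EXIT comes in once the bogus value has produced a real ORA- error.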

Expanding a LUN on Solaris x86 - fdisk problems

Posted by andyh on Apr 26th, 2007

We have some Solaris 10 x86 servers connected to one of our SANs and wanted to add some extra disk space. This is normally straightforward on Sun SPARC servers, but we ran into a few problems on the x86 servers. Format would not show the correct LUN size, even after trying to auto-configure the device type. It turns out we had forgotten that we were using x86 servers and that fdisk was needed to increase the LUN size before going into format.

             Total disk size is 32635 cylinders
             Cylinder size is 16065 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1       Active    Solaris2          1  32634    32634    100

SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)
Enter Selection: 6
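
As a rough cross-check, the geometry fdisk reports above can be turned into a byte count and compared with the size the SAN is supposed to be presenting. A minimal sketch, with `disk_size_bytes` being a made-up helper name:

```python
def disk_size_bytes(cylinders, blocks_per_cylinder, block_size=512):
    """Total size implied by an fdisk geometry report."""
    return cylinders * blocks_per_cylinder * block_size


# Figures taken from the fdisk output above.
size = disk_size_bytes(cylinders=32635, blocks_per_cylinder=16065)
print(size, "bytes, about", round(size / 2**30, 2), "GiB")
```

If the number that comes out is still the old LUN size, fdisk has not yet picked up the expansion.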

At this point we tried a few things and one of them was to delete all partitions and save that. Big mistake. We couldn’t get back into fdisk.

$ sudo fdisk /dev/rdsk/c4t0d0p0
fdisk: Cannot open device /dev/rdsk/c4t0d0p0.

Not good. How could we get back into the disk? We tried putting a new boot block on the device:

 $ sudo fdisk -b /usr/lib/fs/ufs/mboot /dev/rdsk/c4t0d0p0
fdisk: Cannot open device /dev/rdsk/c4t0d0p0.

Still no joy. Finally we had to create a file in a format similar to the output from prtvtoc and then use that to partition the disk:

 $ cat /path_to_file
* Label geometry for device /dev/rdsk/c4t0d0p0
* PCYL     NCYL     ACYL     BCYL     NHEAD NSECT SECSIZ
  16383998    16383998     2        0        1     2150 512
$ sudo fdisk  -S /path_to_file -I /dev/rdsk/c4t0d0p0

And finally we could get back into fdisk, delete all the partitions and create just one partition using the whole disk. After that it was the normal format command to partition the disk correctly, then making filesystems, mounting and so on.

The easy route next time - never delete all the fdisk partitions and save/exit from fdisk!

VimOutliner for beginners

Posted by chris on Apr 25th, 2007

While looking for some outlining software, I came across VimOutliner (Warning: the comments system on the site seems to have been completely taken over by pornographic spam, don’t say I didn’t warn you). This is a simple plugin for Vim which gives you outlining capabilities within the editor. It may not be as slick as some of the other tools out there, but it does what you need and has the advantage of operating on plain old text files. The only problem was that I didn’t find the documentation very helpful. What I wanted was something to get me started, along the lines of “So you’ve installed VimOutliner, here’s how to use it….”. I experimented and eventually worked out how to use it. Looking at the help file again, everything is there, it just isn’t obvious where to start. So here’s my attempt at such a document.

Fast developer database refreshing using physical standby

Posted by jason on Apr 25th, 2007

It is quite commonplace for developers to want to prototype their applications against a full copy of live. One popular method for doing this is using snapshot technology. A good example of this is from Alex Gorbachev. One advantage of snapshots is the potential for using far less storage space than a full copy would require. However, I am less in favour of this, as I fear the additional I/O that may be generated on the production storage array and I like to keep prod and development instances physically well separated. We have a separate EMC storage array for our developers with around 10TB available to play with. This means we can easily hold full copies of live, as our production database is around 130GB in size.

The one serious disadvantage we find with refreshing developers’ test databases is the time it takes to perform a full restore of production from backups. While this is a good test of our ability to recover from our backup tapes, the developers can get frustrated at the amount of time required for a refresh. Currently it takes around 3 hours to perform a full restore, and I did not consider this an acceptable turnaround time.

Looking around for something to accelerate our database refresh time, I came across EMC SnapView. We are using this to take full clones of LUNs, so that we can recover a copy of production from backup once and then clone it multiple times, in effect giving the developers a copy of a copy of live. I then had an idea: why not use a physical standby for the master LUN and clone the developers’ copies from that?

[Diagram: dev_dbcloning.jpg]

The big advantage of cloning from a physical standby is that SnapView works out which blocks have changed. Since the number of changed blocks is relatively small compared with the total number of blocks the database uses, a refresh from the physical standby is far, far quicker than a full clone from backup tape.
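
To see why tracking changed blocks matters so much, here is a toy model in Python. It is not how SnapView is implemented (the storage array’s internals are not covered here). The function names and the 1000-block “LUN” are invented purely to show the difference in the amount of data copied.

```python
def full_clone(source):
    """Copy every block. Returns (clone, number_of_blocks_copied)."""
    return list(source), len(source)


def incremental_refresh(source, clone, changed_blocks):
    """Copy only the blocks recorded as changed since the last clone."""
    for index in changed_blocks:
        clone[index] = source[index]
    return len(changed_blocks)


# A pretend master LUN of 1000 blocks, cloned once in full.
master = [f"block-{i}" for i in range(1000)]
clone, copied_full = full_clone(master)

# Redo apply on the standby touches a handful of blocks; remember which ones.
changed = set()
for index in (7, 42, 900):
    master[index] = f"block-{index}-v2"
    changed.add(index)

copied_incremental = incremental_refresh(master, clone, changed)
print(copied_full, copied_incremental, clone == master)
```

The refresh brings the clone back in line with the master while copying three blocks rather than a thousand, which is the effect we see at a much larger scale.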

Using Perl’s Inline::C to call OpenSSL’s EVP and ENGINE libraries.

Posted by roy on Apr 22nd, 2007

We’d like to use the SCA6000, a hardware security module, from a Perl script.

The problem is that there is no Perl package that combines OpenSSL’s ENGINE and EVP functionality. To route around this problem, we’ll call the necessary C library functions straight from Perl, using the Inline::C package. Inline::C is very useful for making ad hoc function calls to C libraries from Perl. We’ll try to give a brief explanation here.

Note that we don’t need to write any C code for this test; we just need to import C functions using prototypes from openssl/evp.h and openssl/engine.h. One thing is important to realize. In Perl, there is no concept of types. A variable can change from integer to string to integer on the fly. This makes Perl powerful, or annoying, depending on your perspective. But to include C code in your Perl program, the concept of types rises to the surface. To call a C function from Perl, the Perl variable needs to be cast to a proper C type. This casting is done using a typemap, which maps C types to Perl, and vice versa.

For well-known C types, such as int, short, char, etc., there exists a typemap file. A particular line in a typemap contains the C type on the left-hand side and a Perl macro on the right-hand side. The macro contains an INPUT section, to translate Perl to C types, and an OUTPUT section, to translate C types to Perl. For unknown C types, like the EVP_PKEY struct, we’d have to write our own typemap. Luckily, most of these things are pointers to structs whose internals we don’t need to use in Perl, so we’ll map those to T_PTROBJ macros.

In short, it is trivial to include C code in your Perl scripts if it uses well-known C types. For other types, the typemap needs some extra work. That is it, really. A proof-of-concept Perl script (EVP_signer.pl) is included below. If you’re interested in what happens under the covers, look at the example below, and move on to read perlguts, perlxs, etc.

I’ll give an example from an actual typemap file, included in any perl distro:

int	T_IV

INPUT
T_IV
	$var = (int)SvIV($arg)

OUTPUT
T_IV
	sv_setiv($arg, (IV)$var);

$var is the C variable, $arg is the Perl variable.
In the INPUT section (which turns a Perl variable into C) we see a function SvIV($arg). This gets the value of $arg. It is then cast to an integer (using (int)) and stored in $var.
In the OUTPUT section (which turns C into Perl) we see a function sv_setiv(). This turns an integer, stored in $var, into a Perl variable in $arg.

The real typemap we use for this exercise (a perl program that uses the EVP and ENGINE function calls) is fairly straightforward:

EVP_MD_CTX *            T_PTROBJ
EVP_PKEY *              T_PTROBJ
ENGINE *                T_PTROBJ
UI_METHOD *             T_PTROBJ
EVP_MD *                T_PTROBJ
const EVP_MD *          T_PTROBJ
unsigned int*           T_PV
const void*             T_PV

The program itself is below. The code is commented inline:

#!/usr/bin/perl -w

use strict;
use subs qw/check_error_queue/;

# the SCA6000 constant indicates if we're using a SUN SCA6000 card, in which case we're using
# the pkcs11 engine, or if we're using the default openssl engine. We'll also base the path to the
# openssl library based on the SCA6000 value.

use constant SCA6000 => 0;
use constant OPENSSLPATH => SCA6000?'/opt/openssl-0.9.8d':'/usr/local/ssl';

sub main
    {
        # The user needs to specify a key identifier as argument.

    $#ARGV == 0 or die "usage: EVP_signer keyid\n";

        # all the parameters we'd like the user to specify, but we'll declare it here for now.

    my $message = "Hello World!"; # the message to be signed.
    my $key_id = $ARGV[0];        # the key identifier, derived from the argument
    my $engine_id = SCA6000?"pkcs11":"openssl"; # the engine

        # load human readable error strings.

    ERR_load_crypto_strings();

        # read the standard openssl config file

    OPENSSL_config(0); check_error_queue;

        # setup engine

    ENGINE_load_openssl(); check_error_queue;

    my $engine = ENGINE_by_id($engine_id); check_error_queue;

    ENGINE_init($engine) or check_error_queue;

        # To use the engine, we need to gain access first. The user is authenticated by a PIN.
        # This is only useful when the SCA6000 is used as an engine.

    SCA6000 and (ENGINE_ctrl_cmd_string($engine, 'PIN', 'nominet1:abc123', 0) or check_error_queue);

        # assign the private key

    my $key = ENGINE_load_private_key($engine, $key_id, UI_OpenSSL(), 0); check_error_queue;

        # create a digest context and assign the digest method

    my $ctx = EVP_MD_CTX_create(); check_error_queue;
    EVP_SignInit($ctx, EVP_sha1()) or check_error_queue;

        # hash the message into digest context

    EVP_SignUpdate($ctx, $message, length($message)) or check_error_queue;

        # setup a buffer to store the signature in. The buffer needs to be as long as the private key.

    my $sig_buflen = EVP_PKEY_size($key); check_error_queue;
    my $sig_buf = "\0" x $sig_buflen;

        # sign the hash in the digest context with our key

    EVP_SignFinal( $ctx, $sig_buf, $sig_buflen, $key) or check_error_queue;

        # print the signature in hexadecimal encoding.

    print "\n", unpack( "H*", $sig_buf), "\n";

         # clean up after us

    EVP_MD_CTX_destroy($ctx);check_error_queue;
    EVP_PKEY_free($key);check_error_queue;
    ENGINE_finish($engine);check_error_queue;
    ENGINE_free($engine);check_error_queue;
    }

main;

 #  The check_error_queue function is a wrapper around ERR_get_error() to print human
 #  readable error strings. If an error occurred, the program dies, printing
 #  the error.
 #

sub check_error_queue
    {
      my $errcode=ERR_get_error();
      my $errstr="\0"x256;
      if ($errcode)
         {
         ERR_error_string($errcode,$errstr);
         die($errstr,"\n");
         }
    }

use Inline C => DATA =>
  ENABLE => AUTOWRAP =>
  TYPEMAPS => './typemap' =>
  LIBS => '-L'.OPENSSLPATH.'/lib -lcrypto' =>
  INC => '-I'.OPENSSLPATH.'/include'

__END__
__C__
#include <openssl/evp.h>

/* Note that we only need to include the openssl/evp.h header file.
 * The others, like engine.h and err.h are included by the evp.h file
 */

void          OPENSSL_config(const char *config_name);

EVP_MD_CTX*   EVP_MD_CTX_create();
void          EVP_MD_CTX_destroy(EVP_MD_CTX* ctx);

const EVP_MD* EVP_sha1();

void          EVP_PKEY_free(EVP_PKEY* pkey);
int           EVP_PKEY_size(EVP_PKEY* pkey);

int           EVP_SignInit(EVP_MD_CTX* ctx,const EVP_MD* type);
int           EVP_SignUpdate(EVP_MD_CTX* ctx,const void* d,size_t cnt);
int           EVP_SignFinal(EVP_MD_CTX* ctx,unsigned char* md,unsigned int* s,EVP_PKEY* pkey);

int           ENGINE_init(ENGINE *e);
int           ENGINE_finish(ENGINE *e);
int           ENGINE_free(ENGINE *e);
void          ENGINE_load_openssl();

ENGINE*       ENGINE_by_id(const char *id);
EVP_PKEY*     ENGINE_load_private_key(ENGINE *e, const char *key_id, UI_METHOD *ui_method, void *callback_data);
int           ENGINE_ctrl_cmd_string(ENGINE *e, const char *cmd_name, const char *arg, int cmd_optional);
void          ERR_load_crypto_strings();
unsigned long ERR_get_error();
char*         ERR_error_string(unsigned long e,char *buf);

UI_METHOD*    UI_OpenSSL();

dnsjnio moved to sourceforge

Posted by alexd on Apr 19th, 2007

Now that I have finally found external interest in the dnsjnio project, it seems wise to move it to a better home (one with automatic patch support, defect tracking, mailing lists, etc.). Since the dnsjava project (for which dnsjnio is an extension) uses sourceforge, I thought I might as well put dnsjnio there as well.

So, if anybody else is interested in dnsjnio, I’d like to direct them to the new site on sourceforge, where all further communication about dnsjnio will take place.

Problems with Word on Mac

Posted by chris on Apr 17th, 2007

I use a Mac and I also have to deal with Word documents at work, so I have Microsoft Word installed. Until recently it worked without a hitch. In fact I used to joke that it looked better and worked better than Word on Windows. Recently it started behaving rather badly. *Really* badly - not just crashing but bringing down a bunch of other applications with it. I have never seen this behaviour before on OS X. It has been suggested to me that this is down to upgrading the operating system to 10.4.9. But whatever it is, it’s not pretty.

So I have upgraded to NeoOffice 2.1. I have to say it is very nice - since version 2.0 they have made it look much more like a proper Mac application. It does a very good job and copes with the same documents that cause Word to spontaneously combust. I’d suggest giving it a try.

_nicname SRV record

Posted by jay on Apr 17th, 2007

If you want to find out the WHOIS server for a particular TLD then in many cases you can do it with a simple DNS lookup. Just query for an SRV record for the domain _nicname._tcp.tld, like this:

~ jay$ dig +short _nicname._tcp.uk srv
0 0 43 whois.nic.uk.

The answer tells you that the WHOIS server for .uk is on port 43 (as it should be) of the server whois.nic.uk.
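
For developers wanting to build this into their code, here is a minimal Python sketch of the two pieces involved: constructing the name to query and picking apart the answer. It deliberately does no DNS lookup itself; it parses a line in the format that dig +short prints above. The helper names `nicname_query_name` and `parse_srv_answer` are made up for this example.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def nicname_query_name(tld: str) -> str:
    """Build the SRV owner name to query for a TLD's WHOIS server."""
    return f"_nicname._tcp.{tld.strip('.').lower()}"


def parse_srv_answer(line: str) -> SrvRecord:
    """Parse one 'priority weight port target' line as printed by dig +short."""
    priority, weight, port, target = line.split()
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


record = parse_srv_answer("0 0 43 whois.nic.uk.")
print(nicname_query_name("uk"), "->", record.target, "port", record.port)
```

A real client would fetch the record with whatever resolver library it already uses and fall back to its existing method when no SRV record is published.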

Many other TLDs follow this convention including .au .at .dk .fr .de .hu .ie .li .lu .nl .no .re .si .se and .ch. This list has now expanded to include .us and .biz and other registries are actively considering it. Of course for gTLDs with distributed WHOIS services there may be some problems to be overcome.

However, I hope developers pick up on this and start building it into their code. At the moment most developers tend to use a service like whois-servers.net, but this mechanism means they can get the WHOIS server address directly from the registry and so users should get a better experience.
