random technical thoughts from the Nominet technical team

Encountering an Oracle CRS-0213 error

1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading ... Loading ...
Posted by jason on Feb 20th, 2008

This is really just a quick post about the cause of the Oracle error CRS-0213 which I encountered recently and found while doing a google search as well as looking this error up on metalink that there was very little information about this error out there, and nothing that actually helped me solve the problem.

I encountered this error as I was upgrading a RAC database from 10.2.0.3 to 11.1.0.6 and I was attempting to ensure CRS used the new ORACLE_HOME environment, i.e. the 11g one rather than the old 10g one. To do this for the listener you do the following:

/opt/oracle/product/crs/bin/srvctl remove listener -n nodename

/opt/oracle/product/crs/bin/srvctl add listener -n nodename -o NEW ORACLE_HOME

Note this is using the CRS_HOME rather than the ORACLE_HOME. However, when I ran the add part I encountered the following error:

[oracle@linuxrac1 ~]$ /opt/oracle/product/crs/bin/srvctl add listener -n linuxrac1 -o /opt/oracle/product/11.1.0
CRS-0213: Could not register resource 'ora.linuxrac1.LISTENER_LINUXRAC1.lsnr'.

Now the solution to this problem is not exactly earth shattering, but as I said I thought I would blog it anyway as CRS-0213 is not all well known and it might help someone out of a hole. We had the incorrect ORACLE_HOME, in true OFA style I included db_1 onto the end of the 11g ORACLE_HOME, so the above should read -o /opt/oracle/product/11.1.0/db_1. Running with the correct home works fine.

With having 3 separate Oracle software directories now, one for CRS, one for ASM and one for the RDBMS instance, I highly recommend you all look at implementing OFA if you do not already do so!

Cloning an ORACLE_HOME

1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading ... Loading ...
Posted by jason on Feb 8th, 2008

Some time you may have a requirement to copy an ORACLE_HOME from one location to another using OS level commands. However, one thing this will not do is update the Oracle Inventory. Having an incomplete Inventory will screw up any attempts at subsequent patching of this cloned ORACLE_HOME.

oracle@bdb ~]$ opatch lsinventory
Invoking OPatch 10.2.0.4.2

Oracle Interim Patch Installer version 10.2.0.4.2
Copyright (c) 2007, Oracle Corporation.  All rights reserved.

Oracle Home       : /opt/oracle/product/10.2.0.2
Central Inventory : /home/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 10.2.0.4.2
OUI version       : 10.2.0.2.0
OUI location      : /opt/oracle/product/10.2.0.2/oui
Log file location : /opt/oracle/product/10.2.0.2/cfgtoollogs/opatch/opatch2008-02-08_11-58-44AM.log

List of Homes on this system:

  Home name= OraDb10g_home1, Location= "/opt/oracle/product/10.2.0"
  Home name= OraDb10g_home2, Location= "/opt/oracle/product/11.1.0"
Inventory load failed... OPatch cannot load inventory for the given Oracle Home.
Possible causes are:
   Oracle Home dir. path does not exist in Central Inventory
   Oracle Home is a symbolic link
   Oracle Home inventory is corrupted
LsInventorySession failed: OracleHomeInventory gets null oracleHomeInfo

OPatch failed with error code 73

Oracle clearly support the ability to clone an ORACLE_HOME as the oui installer has an option to attach a cloned home to the Inventory:

 runInstaller -attachHome ORACLE_HOME="/opt/oracle/product/10.2.0.2" ORACLE_HOME_NAME="ORADB_102"

After you have attached the cloned home you can quite happily apply interim patches and patchsets to your newly cloned ORACLE_HOME.

ORA-00600 error (arguments [13009], [5000], …) on Oracle 10.2.0.2 Database

1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 4 out of 5)
Loading ... Loading ...
Posted by patrick on Jan 7th, 2008

We experienced the following error on one of our Oracle test databases:

ORA-00600: internal error code, arguments: [13009], [5000], [1], [17], [1], [], [], []

The statement which generated this error was a simple select ... for update nowait:

SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

ERROR at line 2:
ORA-00600: internal error code, arguments: [13009], [5000], [1], [17], [1], [], [], []

However, performing the same operation accessing the row via Oracle rowid was successful:

SELECT rowid, key, suffix, status
FROM table1
WHERE rowid = 'AAAPBRAAPAAAAI2AAk'
FOR UPDATE NOWAIT;

ROWID              KEY           SUFFIX                       STATUS
------------------ ------------- ---------- ------------------------
AAAPBRAAPAAAAI2AAk example-key   co.uk                             0

This led me to believe there must be a problem with the access method used by the first statement, probably a corruption on an index. Performing an explain plan revealed two indexes were used by the first statement:

explain plan for
SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

---------------------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |            |     1 |    12   (9)| 00:00:01 |
|   1 |  FOR UPDATE                      |            |       |            |          |
|   2 |   BITMAP CONVERSION TO ROWIDS    |            |     1 |    12   (9)| 00:00:01 |
|   3 |    BITMAP AND                    |            |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS|            |       |            |          |
|*  5 |      INDEX RANGE SCAN            | IX1_TABLE1 |     1 |     3   (0)| 00:00:01 |
|   6 |     BITMAP CONVERSION FROM ROWIDS|            |       |            |          |
|   7 |      SORT ORDER BY               |            |       |            |          |
|*  8 |       INDEX RANGE SCAN           | PK_TABLE1  |     1 |     8   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

I decided to validate the structure of the table and indexes, but this revealed no errors:

analyze table table1 validate structure cascade;
Table analyzed.

I next rebuilt the indexes used by the query:

alter index pk_table1 rebuild online;
Index altered.

alter index ix1_table1 rebuild online;
Index altered.

This resolved the problem:

SELECT rowid, key, suffix, status
FROM table1
WHERE KEY = 'example_key'
    AND suffix = 'co.uk'
    AND status = 0
FOR UPDATE NOWAIT;

ROWID              KEY           SUFFIX                       STATUS
------------------ ------------- ---------- ------------------------
AAAPBRAAPAAAAI2AAk example-key   co.uk                             0

Oracle Linguistic Indexes

1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading ... Loading ...
Posted by jason on Jan 4th, 2008

Quite a bit of synchronicity occurred recently to the dba team here at Nominet. We have been following the writings of Richard Foote and in particular an article on Linguistic Indexes. I thought the article interesting though somewhat obscure and filed it at the back of mind.

We have been working on an upgrade to our ticketing/service management system and we came across a custom written query that was performing like a dog when run by the application, but producing fast performance when run by other schema owners:

SELECT count (*) FROM   user.domains d WHERE  d.key = 'nominet';

------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                |     1 |    14 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                |     1 |    14 |            |          |
|*  2 |   INDEX RANGE SCAN| PK_DOMAINS_IDX |     1 |    14 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

But when run as the user the application connects as:

SELECT count (*) FROM   user.domains d WHERE  d.key = 'nominet';

--------------------------------------------------------------------------------------------
| Id  | Operation                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                |     1 |    14 |  5209   (3)| 00:01:03 |
|   1 |  SORT AGGREGATE           |                |     1 |    14 |            |          |
|   2 |   PX COORDINATOR          |                |       |       |            |          |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000       |     1 |    14 |            |          |
|   4 |     SORT AGGREGATE        |                |     1 |    14 |            |          |
|   5 |      PX BLOCK ITERATOR    |                |     1 |    14 |  5209   (3)| 00:01:03 |
|*  6 |       INDEX FAST FULL SCAN| PK_DOMAINS_IDX |     1 |    14 |  5209   (3)| 00:01:03 |
--------------------------------------------------------------------------------------------

We had other examples where a primary key index was being ignored by the application schema user which was doing a full table scan of a large table instead. The obvious conclusion to jump to (and the immediate one which we did), is a difference in the optimizer environments for the two schemas. We checked this using V$sql_optimizer_env, but the different child cursors produced for the query had the same optimizer environment settings. I even did a 10132 event dump of running the query in the 2 schemas, but I could not for the life of me put my finger on what was producing the differing plans.

It had to be something in the environment of the application schema, so we looked at roles & privileges until eventually we looked at logon triggers, and there it was:

begin
	       execute immediate 'alter session set NLS_COMP=LINGUISTIC';
	       execute immediate 'alter session set NLS_SORT=BINARY_CI';
	    end;

There is a good chapter in the Oracle documentation on Linguistic sorting, which you can read for yourself. There are a couple of things that are surprising for me, first these nls changes to a session do not seem to get into the V$sql_optimizer_env or other exposed views of what the optimizer environment was at run time, they clearly can have a large impact on the explain plan generated. Secondly, I don’t understand what was wrong with the traditional function-based index and having UPPER(column) whenever a case-insensitive search is required.

Of course in the interest of full disclosure I should point out that full explain plan had the following hiding after the plan:

       filter(NLSSORT("KEY",'nls_sort=''BINARY_CI''')=HEXTORAW('69747600') )

Be Careful when setting log_archive_max_processes

1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 4 out of 5)
Loading ... Loading ...
Posted by jason on Jan 3rd, 2008

Anyone interested in high-availability in an Oracle setting would do well to read the MAA pages on the Oracle website, there are many good papers and guidelines for optimising your environment contained there. For dataguard, I highly recommend the best practices for Redo Transport & Redo Apply papers. One recommendation contained there is increasing the log_archive_max_processes parameter, which controls the number of arc processes the instance uses, which in turn can help resolve any archive gaps. Unfortunately, I think I may have over egged this one, and as they say nothing is for free. Each server process takes some memory for it’s PGA and Oracle server background processes are no exception.

Lately, our monitoring system has been warning about low free memory on one of our RAC nodes, which seemed to be correlated to the number of oracle processes running. Hence I started looking at how much memory we were using for our pga:

SQL> select * from V$PGASTAT;

NAME		                        VALUE     UNIT
--------------------------------    ---------- ------------
aggregate PGA target parameter        199229440   bytes
aggregate PGA auto target              12451840   bytes
global memory bound                    39845888   bytes
total PGA inuse                      3658247168   bytes
total PGA allocated                  5270902784   bytes
maximum PGA allocated                6450857984   bytes
total freeable PGA memory             320667648   bytes
.
.

So this Instance had allocated 5GB to the pga, it’s interesting to note that the oracle documentation state that the total pga allocated will try to be kept under the aggegrate pga target, but we are an order of magnitude over here. I had thought the pga target parameter was maximum amount that could be allocated to each session, rather than a global total. Certainly my 190MB pga_aggregate_target is being overallocated by a long way.

I then looked at V$process to check what processes were using the most pga:

SQL> SELECT PROGRAM, PGA_USED_MEM, PGA_ALLOC_MEM/(1024*1024) "ALLOC(MB)",
PGA_FREEABLE_MEM FREEABLE, PGA_MAX_MEM
FROM V$PROCESS
order by pga_alloc_mem asc;

PROGRAM 	        PGA_USED_MEM  ALLOC(MB)FREEABLE PGA_MAX_MEM
----------------------  ------------  --------- ----- -------------
.
.
oracle@PROD (ARCg)	24274349     48.0507708   0    50384885
oracle@PROD (ARCf)	24276029     48.0507708   0    50384885
oracle@PROD (ARCe)	24274205     48.0507708   0    50384885
oracle@PROD (ARCd)	24277965     48.0507708   0    50384885
oracle@PROD (ARCc)	24259573     48.0507708   0    50384885
oracle@PROD (ARC5)	24274389     48.0507708   0    50384885
.
.

Basically, every archive background process was consuming nearly 50MB of memory. All 30 of them. That is 1.5GB of RAM. Thankfully, the log_archive_max_processes parameter is dynamic and I could remedy the situation with a simple alter system command, and after a little wait the unrequired arc processes were removed and the system had a lot more free memory.

Testing database locking with autonomous transactions

1 Star2 Stars3 Stars4 Stars5 Stars (2 votes, average: 4.5 out of 5)
Loading ... Loading ...
Posted by chris on Dec 21st, 2007

We recently hit a problem caused by a database row lock. The actual update being made was not particularly important, as it was just a denormalized value pre-calculated for convenience. However, the consequences of the lock were rather nasty, with many processes waiting for the lock to be released. As a result I decided to change the code to allow the update to be deferred. So if the update encountered a lock, it would write its change to a queue table instead, to be applied later. I did this by opening a cursor with FOR UPDATE NOWAIT. This will throw an exception if it can’t get an immediate lock on the relevant rows.

So far so good, but how do you test something like this? The code in question has been around quite a while and is in PL/SQL (these days most new code we write is in Java). This means we use utPLSQL to test this functionality and so the tests themselves are also written in PL/SQL. The answer is the use of autonomous transactions. These allow you to mark a PL/SQL program unit with a special marker PRAGMA AUTONOMOUS_TRANSACTION that makes it run in its own transaction. So if the code we are testing is in the Make_The_Update procedure of the MyPackage package, the test code would look something like this:

.....

    --This Procedure does its thing in another
    --transaction to force an error to occur
    PROCEDURE Update_Externally IS
      PRAGMA AUTONOMOUS_TRANSACTION;
    BEGIN
       MyPackage.Make_The_Update(1000);
       COMMIT;
    END;

    PROCEDURE ut_check_locking IS
    BEGIN

      --Make an update in the first transaction, but don't commit
      MyPackage.Make_The_Update(10);
      --Make an update in the second transaction
      Update_Externally;
      --Check that it wrote to the queue table here
      ......

    END;

.....

For those unfamiliar with utPLSQL, test procedures are named ut_*. The other procedure is just there for convenience. I have left out much of the boilerplate, such as the setup and teardown code and so on.

So let’s look at what it does. The test procedure ut_check_locking makes an update, but doesn’t commit, then calls the other procedure to attempt an update in a different transaction. (Note that if our code wasn’t designed to deal with this situation, this would be a surefire way to generate a deadlock. Luckily Oracle detects these quite quickly, so if you make a mistake you’ll just get an exception after a few seconds). Finally, we check if the relevant rows have been written to our queue table. Doing this in utPLSQL is left as an exercise for the reader.

I initially had some problems with this, because I forgot that any test data needs to be visible from both transactions. Initially the main transaction just created it and then rolled it back at the end. This gave me some odd ORA-01410: Invalid ROWID errors as I was trying to update non-existent rows. Then I realised that I needed to explicitly commit this data in the setup.

I’ve used autonomous transactions in production code for doing things such as logging when you want the main transaction to roll back. But I’ve not seen them used as an aid to test issues around concurrent data access. But they are certainly helpful in that situation too.

DBCA and large filesystems

1 Star2 Stars3 Stars4 Stars5 Stars (3 votes, average: 1.67 out of 5)
Loading ... Loading ...
Posted by jason on Nov 9th, 2007

I never used to like dbca for creating a database, I much prefered the more hardcore (and more configurable) method of creating a script to create the database, and indeed all the production databases created here at nominet have been scripted rather than gui’d into existence. However I have found myself using dbca more and more, particularly since we upgraded to Oracle 10.2 a while back, dbca is much more useable, I used to find this java painfully slow, but 10.2 performs much faster. The reason I’ve been using it more is that our developers want empty databases that they then run the auto schema building tools on to give them the schemas to develop against but without data. I find dbca very useful for creating this type of database.
However it seems like dbca (10.2.0.3) has a limit on the size of filesystem that it can understand and throws an error when attempting to install onto a filesystem that is larger than it understands:

dbca file overflow

There was actually around 16TB of free space available rather than the -1 the gui is suggesting. Thankfully clicking ignore allowed the installation to proceed successfully.

Changing kernel parameters in Solaris 10

1 Star2 Stars3 Stars4 Stars5 Stars (2 votes, average: 3.5 out of 5)
Loading ... Loading ...
Posted by jason on Oct 19th, 2007

I’ve been installing Oracle 10g on Solaris 10 x86-64. I was keen to avoid doing the old fashioned method of editing /etc/system but instead set the various shared memory segments dynamically using resource management and projects. The official Oracle installation guide for Solaris 10 x86-64 mentions how to do this, but it actually contains incorrect and incomplete information. Ignore the Oracle documentation and set your shared memory kernel parameters by doing the following:

sudo projadd -U oracle -G oinstall user.oracle
sudo projmod -sK "project.max-shm-memory=(priv,14294967295,deny)" user.oracle
sudo projmod -sK "process.max-sem-ids=(priv,1024,deny)" user.oracle
sudo projmod -sK "process.max-sem-nsems=(priv,1024,deny)" user.oracle
sudo projmod -sK "process.max-shm-ids=(priv,1024,deny)" user.oracle 

You can look at /etc/project to ensure you have correctly entered the parameters. I would recommend looking at id -p as the oracle user to make sure that the user you are running oracle as, is in the correct project:

id -p
uid=2000(oracle) gid=2000(oinstall) projid=100(user.oracle)

You can see the oracle user is using the user.oracle project, as opposed the default project.

Spring JMS with Oracle AQ

1 Star2 Stars3 Stars4 Stars5 Stars (3 votes, average: 4.67 out of 5)
Loading ... Loading ...
Posted by tom on Oct 4th, 2007

This probably isn’t news to many but I only recently noticed that Oracle AQ can manage JMS queues. We don’t use a full blown J2EE application server so our options when using a JMS broker is to either have a standalone server/cluster (more work for SAs) or to embed the JMS server in to our middleware (more work for me). By using the database we add no new dependencies or software so this is a pretty neat solution. Another big advantage for us is that it looks like we would be able to create messages inside PL/SQL procedures but I may be getting ahead of myself here.

There isn’t much to it but it was a lot harder to get to the same point than with the other JMS brokers I have been testing. There is a lot of documentation to go through and it is all fairly general. I also couldn’t find a good example so I’ve created a quick howto… Continue Reading »

Oracle Job Scheduling (dbms_scheduler)

1 Star2 Stars3 Stars4 Stars5 Stars (7 votes, average: 4.43 out of 5)
Loading ... Loading ...
Posted by patrick on Sep 28th, 2007

In version 10g of the Oracle Database the Oracle Scheduler (dbms_scheduler package) was introduced. This allows the scheduling of jobs within an Oracle database and has many more features than the older dbms_job package, which is still available in Oracle 10g.

You can schedule jobs to run at a designated date and time or upon the occurrence of an event. You can set priorities for and dependencies between jobs.

Continue Reading »

« Prev - Next »

Recent Posts

Highest Rated

Categories

Archives

Meta: