Saturday, September 5, 2015

OGG-01496 Failed to open target trail file


My pump process had been down for many days. When I tried to start it, it ABENDED because it was trying to write to a remote trail file that is MISSING.



GGSCI (oracleqa011.domain.com) 7> start PUMPQA

Sending START request to MANAGER ...
EXTRACT PUMPQA starting


GGSCI (oracleqa011.domain.com) 8> info PUMPQA

EXTRACT    PUMPQA  Last Started 2015-08-20 19:59   Status ABENDED
Checkpoint Lag       00:00:00 (updated 00:00:28 ago)
Log Read Checkpoint  File ./dirdat/cb000011  --> This is  source trail file
                     2015-08-20 19:59:30.000000  RBA 36679


GGSCI (oracleqa011.domain.com) 9> view report PUMPQA


2015-08-20 19:59:44  ERROR   OGG-01496  Failed to open target trail file /u01/oracle/TARGET/cb000008, at RBA 8448365.

2015-08-20 19:59:44  ERROR   OGG-01668  PROCESS ABENDING.


Here the pump process is trying to resume writing to the file “cb000008”, where it left off. Because the pump was down for so long, this file no longer exists in the target location.
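
To find out which target trail file the pump is complaining about without reading the whole report, the path can be pulled out of the OGG-01496 line. This is only a minimal sketch: the function name missing_trails is made up for illustration, and the pattern is modelled on the single message shown above, so other GoldenGate releases may word it differently.

```sh
#!/bin/sh
# Print the target trail path named in each OGG-01496 line read from stdin.
# Assumption: the message has the same wording as the one in the report above.
missing_trails() {
    sed -n 's/.*OGG-01496  *Failed to open target trail file \([^,]*\),.*/\1/p'
}

# Hypothetical usage (the report file location is an assumption):
#   missing_trails < dirrpt/PUMPQA.rpt
# Then, on the target host, check each printed path with: ls -l <path>
```

If the printed path does not exist on the target, you are in the situation described here.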


I tried the BEGIN NOW command, but it did not help:

GGSCI (oracleqa011.domain.com) 10> alter PUMPQA begin now
EXTRACT altered.

Still “ABENDED”    -->  because the pump will begin from now but still tries to write to the same trail file


Solution:

Do an “ETROLLOVER” for the pump process. This rolls the pump over to the next trail file in the sequence (here cb000009).

GGSCI (oracleqa011.domain.com) 10> ALTER EXTRACT PUMPQA ETROLLOVER

2015-08-20 20:02:22  INFO    OGG-01520  Rollover performed.  For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file;  it will not happen automatically.
EXTRACT altered.


GGSCI (oracleqa011.domain.com) 11> start PUMPQA

Sending START request to MANAGER ...
EXTRACT PUMPQA starting


GGSCI (oracleqa011.domain.com) 12> info PUMPQA

EXTRACT    PUMPQA  Last Started 2015-08-20 20:02   Status RUNNING
Checkpoint Lag       00:03:09 (updated 00:00:08 ago)
Log Read Checkpoint  File ./dirdat/cb000011
                     2015-08-20 19:59:30.000000  RBA 36679



On the target server I can see that “cb000009” has been created and is being written to.
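
To know in advance which file to look for on the target after the rollover, the next trail name can be worked out from the one in the error. A small sketch, assuming the six-digit sequence suffix seen in this post (cb000008); the function name next_trail is only illustrative.

```sh
#!/bin/sh
# Given a trail file name ending in a six-digit sequence number
# (the format seen above, e.g. cb000008), print the next name in sequence.
next_trail() {
    name=$1
    prefix=${name%??????}      # everything before the last six characters
    seq=${name#"$prefix"}      # the six-digit sequence, e.g. 000008
    zeros=${seq%%[!0]*}        # the leading zeros only
    seq=${seq#"$zeros"}        # strip them so the shell does not read octal
    printf '%s%06d\n' "$prefix" "$(( ${seq:-0} + 1 ))"
}

# Usage: next_trail /u01/oracle/TARGET/cb000008
```

As the OGG-01520 message above points out, the reader of this trail does not follow the rollover automatically. On the target that would mean something like ALTER REPLICAT <group>, EXTSEQNO 9, EXTRBA 0, where <group> is a placeholder for your own Replicat name.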


ERROR OGG-01224 Address already in use.


DB version: 11.2.0.4 2-Node RAC
OS:  RHEL 6
GG version :  11.2.1.0.3

Today when I try to start my manager process, it does not start:

GGSCI (oracledev01) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED
JAGENT      STOPPED
EXTRACT     ABENDED     EOECBP      291:56:23     27:02:50

GGSCI (oracledev01) 2> start mgr

Manager started.

But when I run info mgr, it says the manager is DOWN:

GGSCI (oracledev01) 3> info mgr

Manager is DOWN!


GGSCI (oracledev01) 4> view params mgr

PORT 7809
DYNAMICPORTLIST 7840-9860


GGSCI (oracledev01) 5> view report mgr


***********************************************************************
                 Oracle GoldenGate Manager for Oracle
 Version 11.2.1.0.14 17547423 OGGCORE_11.2.1.0.0OGGBP_PLATFORMS_131022.0605
   Linux, x64, 64bit (optimized), Oracle 11g on Oct 22 2013 07:38:49

Copyright (C) 1995, 2013, Oracle and/or its affiliates. All rights reserved.


                    Starting at 2015-08-11 19:56:22
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Fri May 29 10:16:43 EDT 2015, Release 2.6.32-504.23.4.el6.x86_64
Node: sl73orcdbdbq005
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited

Process id: 36788

Parameters...

PORT 7809
DYNAMICPORTLIST 7840-9860


***********************************************************************
**                     Run Time Messages                             **
***********************************************************************


Source Context :
  SourceModule            : [mgr.main]
  SourceID                : [/scratch/aime1/adestore/views/aime1_adc4150267/oggcore/OpenSys/src/app/mgr/mgr.c]
  SourceFunction          : [init_functions]
  SourceLine              : [3390]
  ThreadBacktrace         : [8] elements
                          : [/gg/GG11/libgglog.so(CMessageContext::AddThreadContext()+0x1e) [0x7ff4595509fe]]
                          : [/gg/GG11/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...)+0x2cc) [0x7ff45954974c]]
                          : [/gg/GG11/libgglog.so(_MSG_ERR_TCP_GENERIC(CSourceContext*, char const*, CMessageFactory::MessageDisposition)+0x31) [0x7ff4595318a5]]
                          : [./mgr(init_functions(int, char**)+0x7f5) [0x4511c5]]
                          : [./mgr(main_loop(int, char**)+0x4c) [0x454aec]]
                          : [./mgr(main+0xf2) [0x455362]]
                          : [/lib64/libc.so.6(__libc_start_main+0xfd) [0x3d9361ed5d]]
                          : [./mgr(__gxx_personality_v0+0x142) [0x43efca]]

2015-08-11 19:56:22  ERROR   OGG-01224  Address already in use.

2015-08-11 19:56:22  ERROR   OGG-01668  PROCESS ABENDING.


Error in ggserr.log:

2015-08-11 19:56:10  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info mgr.
2015-08-11 19:56:19  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info all.
2015-08-11 19:56:22  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): start mgr.
2015-08-11 19:56:22  ERROR   OGG-01224  Oracle GoldenGate Manager for Oracle, mgr.prm:  Address already in use.
2015-08-11 19:56:22  ERROR   OGG-01668  Oracle GoldenGate Manager for Oracle, mgr.prm:  PROCESS ABENDING.


CAUSE:

A previous mgr process is still running and is holding port 7809, the port set in the parameter file.


SOLUTION:

Use a different port and start the manager
OR
release the port with the steps below.


As root or as the GoldenGate owner, check the port:

[root@oracledev01~]#  netstat -nap | grep 7809
tcp        0      0 0.0.0.0:7809                0.0.0.0:*                   LISTEN      44402/./mgr

The output above shows that an old mgr process (PID 44402) is already listening on port 7809, so kill this old process and start the manager again:

[root@oracledev01~]# kill -9 44402
[root@oracledev01~]# netstat -nap | grep 7809
(no output, so the port is now free)

[oracle@sl73orcdbdbq005 GG11]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.14 17547423 OGGCORE_11.2.1.0.0OGGBP_PLATFORMS_131022.0605_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Oct 22 2013 11:03:39

Copyright (C) 1995, 2013, Oracle and/or its affiliates. All rights reserved.


GGSCI (oracledev01) 1> start mgr

Manager started.


GGSCI (oracledev01) 2> info mgr


Manager is running (IP port oracledev01.7809).
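
The port check above can be made a little stricter than a plain grep, which would also match port 17809 or a PID that contains 7809. A minimal sketch that reads netstat -nap style output; the function name pid_on_port is only illustrative, and the column positions are assumed to match the Linux output shown above.

```sh
#!/bin/sh
# Read `netstat -nap`-style lines on stdin and print the PID of the process
# listening on the given TCP port. The port must be the end of the local
# address (field 4), and the state (field 6) must be LISTEN.
pid_on_port() {
    port=$1
    awk -v suffix=":$port" '
        $6 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix {
            split($7, owner, "/")    # field 7 looks like 44402/./mgr
            print owner[1]
        }'
}

# Usage: netstat -nap | pid_on_port 7809
```

Once you have the PID, it is usually kinder to try a plain kill (SIGTERM) first and keep kill -9 for a process that refuses to go away.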


Thursday, August 13, 2015

ENABLE automatic statistics collection in 11g.


SQL> SHO PARAMETER STATISTICS_LEVEL

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
statistics_level                     string      TYPICAL

SQL> select client_name, status,attributes,service_name from dba_autotask_client;

CLIENT_NAME                              STATUS          ATTRIBUTES                                                   SERVICE_NAME
---------------------------------------- -------- ------------------------------------------------------------ ----------------------------
auto optimizer stats collection          DISABLED  ON BY DEFAULT, VOLATILE, SAFE TO KILL
auto space advisor                       ENABLED  ON BY DEFAULT, VOLATILE, SAFE TO KILL
sql tuning advisor                       ENABLED  ONCE PER WINDOW, ON BY DEFAULT, VOLATILE, SAFE TO KILL


SQL> select client_name,status from DBA_AUTOTASK_TASK;

CLIENT_NAME                              STATUS
---------------------------------------- --------
sql tuning advisor                       ENABLED
auto space advisor                       ENABLED


SQL> SELECT window_name,TO_CHAR(window_next_time,'DD-MON-YY HH24:MI:SS'),sql_tune_advisor, optimizer_stats, segment_advisor FROM DBA_AUTOTASK_WINDOW_CLIENTS;

WINDOW_NAME                    TO_CHAR(WINDOW_NEXT_TIME,'D   SQL_TUNE   OPTIMIZE   SEGMENT_
------------------------------ --------------------------- -------- -------- --------
MONDAY_WINDOW                  17-AUG-15 22:00:00          ENABLED  DISABLED ENABLED
TUESDAY_WINDOW                 18-AUG-15 22:00:00          ENABLED  DISABLED ENABLED
WEDNESDAY_WINDOW               19-AUG-15 22:00:00          ENABLED  DISABLED ENABLED
THURSDAY_WINDOW                13-AUG-15 22:00:00          ENABLED  DISABLED ENABLED
FRIDAY_WINDOW                  14-AUG-15 22:00:00          ENABLED  DISABLED ENABLED
SATURDAY_WINDOW                15-AUG-15 06:00:00          ENABLED  DISABLED ENABLED
SUNDAY_WINDOW                  16-AUG-15 06:00:00          ENABLED  DISABLED ENABLED

7 rows selected.

SQL>
SQL> SELECT  ENABLED FROM DBA_SCHEDULER_PROGRAMS WHERE PROGRAM_NAME = 'GATHER_STATS_PROG';

ENABL
-----
TRUE

Check whether the table (here TEST_TAB) has any pending stats:

SQL> select LAST_ANALYZED,NUM_ROWS from dba_TAB_PENDING_STATS where TABLE_NAME='TEST_TAB';

no rows selected

Enable automatic optimizer stats collection:

SQL>  BEGIN
     DBMS_AUTO_TASK_ADMIN.ENABLE(
     client_name => 'auto optimizer stats collection',
     operation => NULL,
     window_name => NULL);
     END;
     / 

PL/SQL procedure successfully completed.


SQL> SELECT window_name,TO_CHAR(window_next_time,'DD-MON-YY HH24:MI:SS'),sql_tune_advisor, optimizer_stats, segment_advisor FROM DBA_AUTOTASK_WINDOW_CLIENTS;

WINDOW_NAME                    TO_CHAR(WINDOW_NEXT_TIME,'D SQL_TUNE OPTIMIZE SEGMENT_
------------------------------ --------------------------- -------- -------- --------
MONDAY_WINDOW                  17-AUG-15 22:00:00          ENABLED  ENABLED  ENABLED
TUESDAY_WINDOW                 18-AUG-15 22:00:00          ENABLED  ENABLED  ENABLED
WEDNESDAY_WINDOW               19-AUG-15 22:00:00          ENABLED  ENABLED  ENABLED
THURSDAY_WINDOW                13-AUG-15 22:00:00          ENABLED  ENABLED  ENABLED
FRIDAY_WINDOW                  14-AUG-15 22:00:00          ENABLED  ENABLED  ENABLED
SATURDAY_WINDOW                15-AUG-15 06:00:00          ENABLED  ENABLED  ENABLED
SUNDAY_WINDOW                  16-AUG-15 06:00:00          ENABLED  ENABLED  ENABLED

7 rows selected.
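
If the same change has to be made on several databases, the enable step and the check can be generated from the shell. This is just a sketch: the function name autotask_enable_sql is made up, it uses the same DBMS_AUTO_TASK_ADMIN.ENABLE call as above, and it assumes the client name contains no single quotes.

```sh
#!/bin/sh
# Print a SQL*Plus script that enables one automated maintenance client in
# all windows (operation and window_name left NULL, as in the block above)
# and then lists the client statuses so the change can be verified.
# Assumption: the client name contains no single quotes.
autotask_enable_sql() {
    client=$1
    cat <<EOF
BEGIN
  DBMS_AUTO_TASK_ADMIN.ENABLE(
    client_name => '$client',
    operation   => NULL,
    window_name => NULL);
END;
/
SELECT client_name, status FROM dba_autotask_client;
EOF
}

# Hypothetical usage:
#   autotask_enable_sql 'auto optimizer stats collection' | sqlplus -s / as sysdba
```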


Wednesday, August 5, 2015

ORA-04045: errors during recompilation/revalidation of GG_USER.DDLREPLICATION


DB version: 11.2.0.4
OS:  RHEL 6
GG version :  11.2.1.0.3

Every DDL operation performed in the database fails with the error below.


Example:
SQL> ALTER TABLE USER.TEST MODIFY (emp_id VARCHAR2(100) );
ALTER TABLE USER.TEST MODIFY (emp_id VARCHAR2(100) )
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-04045: errors during recompilation/revalidation of GGS_ADMIN.DDLREPLICATION
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06512: at line 1100
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06512: at line 977
ORA-04045: errors during recompilation/revalidation of GGS_ADMIN.DDLREPLICATION
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06512: at line 1100
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-04045: errors during recompilation/revalidation of GGS_ADMIN.DDLREPLICATION
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06512: at line 1100
ORA-04067: not executed, package body "GGS_ADMIN.DDLREPLICATION" does not exist
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"
ORA-06508: PL/SQL: could not find program unit being called:
"GGS_ADMIN.DDLREPLICATION"

I also see a few invalid objects in the database, and the GoldenGate package body is one of them:

SQL> select count(*) from  dba_objects where status='INVALID';
COUNT(*)
----------
        35

SQL> select owner,object_name,object_type,status from dba_objects where object_name='DDLREPLICATION';

OWNER      OBJECT_NAME     OBJECT_TYPE   STATUS
---------- --------------- ------------- -------
GGS_ADMIN  DDLREPLICATION  PACKAGE BODY  INVALID
GGS_ADMIN  DDLREPLICATION  PACKAGE       VALID


I tried running the @?/rdbms/admin/utlrp.sql script, but got the same error.

Solution:

Disable the “GGS_DDL_TRIGGER_BEFORE” trigger, run the utlrp script to revalidate the objects, and then re-enable the DDL trigger.

SQL> select trigger_name,status,action_type from dba_triggers where owner='SYS' and trigger_name='GGS_DDL_TRIGGER_BEFORE';

TRIGGER_NAME                      STATUS                   ACTION_TYPE
--------------------------------- ------------------------ ---------------------------------
GGS_DDL_TRIGGER_BEFORE            ENABLED                  PL/SQL

SQL> alter trigger sys.GGS_DDL_TRIGGER_BEFORE disable ;
Trigger altered.


SQL> select trigger_name,status,action_type from dba_triggers where owner='SYS' and trigger_name='GGS_DDL_TRIGGER_BEFORE';

TRIGGER_NAME                      STATUS                   ACTION_TYPE
--------------------------------- ------------------------ ---------------------------------
GGS_DDL_TRIGGER_BEFORE            DISABLED                 PL/SQL



SQL> @?/rdbms/admin/utlrp.sql

SQL> select owner,object_name,object_type,status from dba_objects where object_name='DDLREPLICATION';

OWNER      OBJECT_NAME     OBJECT_TYPE   STATUS
---------- --------------- ------------- -------
GGS_ADMIN  DDLREPLICATION  PACKAGE BODY  VALID
GGS_ADMIN  DDLREPLICATION  PACKAGE       VALID


SQL> alter trigger sys.GGS_DDL_TRIGGER_BEFORE enable ;

Trigger altered.

SQL> select trigger_name,status,action_type from dba_triggers where owner='SYS' and trigger_name='GGS_DDL_TRIGGER_BEFORE';

TRIGGER_NAME                      STATUS                   ACTION_TYPE
--------------------------------- ------------------------ ---------------------------------
GGS_DDL_TRIGGER_BEFORE            ENABLED                  PL/SQL


Now I can do all my DDL operations. :)
If you still see the same error, disable the trigger again and reinstall the DDL replication package.

NOTE: The same approach applies when patching or when running scripts such as catupgrd, catproc, catuppst, or utlrp: disable the DDL trigger first and enable it again afterwards.
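
Because it is easy to forget to re-enable the trigger, the whole sequence can be generated as one script. A minimal sketch, assuming the trigger name and owner shown above; the function name ddl_trigger_wrap_sql is only illustrative, and the script path defaults to utlrp.sql.

```sh
#!/bin/sh
# Print a SQL*Plus script that runs a script (utlrp.sql by default) between
# DISABLE and ENABLE of the GoldenGate DDL trigger, then shows the status of
# the DDLREPLICATION objects.
# Assumption: the trigger is sys.GGS_DDL_TRIGGER_BEFORE, as in this post.
ddl_trigger_wrap_sql() {
    script=${1:-?/rdbms/admin/utlrp.sql}
    cat <<EOF
ALTER TRIGGER sys.GGS_DDL_TRIGGER_BEFORE DISABLE;
@$script
ALTER TRIGGER sys.GGS_DDL_TRIGGER_BEFORE ENABLE;
SELECT owner, object_name, object_type, status
  FROM dba_objects
 WHERE object_name = 'DDLREPLICATION';
EOF
}

# Hypothetical usage:
#   ddl_trigger_wrap_sql | sqlplus -s / as sysdba
```

Keep in mind that DDL run while the trigger is disabled is not captured for DDL replication, so keep that window short.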


Refer:
  Do I Need To Disable The GoldenGate DDL Trigger Before An Oracle DB Upgrade or PSU patching? (Doc ID 971222.1)

Thursday, June 25, 2015

ORA-39202: Data cannot be filtered or selected in ESTIMATE_ONLY jobs

I have a huge table with historical data, and I want to use Data Pump to estimate the size of the rows from roughly the last 6 months.

$ expdp directory=EXP_DIR nologfile=y query=table:\"where CRT_TS\>'01-JAN-15 06.07.03.799000000 AM'\" tables=SCHEMA.TABLE  estimate_only=y

Export: Release 11.2.0.4.0 - Production on Thu Jun 25 15:48:46 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Username: / as sysdba

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining and Real Application Testing options
ORA-39001: invalid argument value
ORA-39202: Data cannot be filtered or selected in ESTIMATE_ONLY jobs.


According to Oracle Support, this is tracked as a bug in 11gR2 (a documentation bug, going by its title): rows cannot be filtered in an ESTIMATE_ONLY job.

“Bug 9536364 : DOC: DATA_FILTER AND ESTIMATE OPTIONS CANNOT BE USED TOGETHER IN DBMS_DATAPUMP”


Action: Do not restrict data handling on jobs that cannot support data filtering.
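
To catch this before Data Pump even connects, the argument list can be checked for the conflicting pair. This is only a sketch of the rule stated above: the function name is made up, and it looks only at QUERY and ESTIMATE_ONLY, not at other filtering parameters.

```sh
#!/bin/sh
# Return 0 if the given expdp-style arguments combine a QUERY filter with
# ESTIMATE_ONLY=Y (the combination rejected with ORA-39202), 1 otherwise.
# Parameter names are compared case-insensitively.
has_estimate_filter_conflict() {
    has_query=0
    has_estimate=0
    for arg in "$@"; do
        lower=$(printf '%s' "$arg" | tr '[:upper:]' '[:lower:]')
        case $lower in
            query=*)          has_query=1 ;;
            estimate_only=y*) has_estimate=1 ;;
        esac
    done
    [ "$has_query" -eq 1 ] && [ "$has_estimate" -eq 1 ]
}

# Usage:
#   if has_estimate_filter_conflict "$@"; then
#       echo "QUERY cannot be combined with ESTIMATE_ONLY=Y" >&2
#   fi
```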





Thursday, April 23, 2015

Oracle Cluster Health Monitor (CHM) using large amount of space (crfclust.bdb)

Last night my 2-node RAC servers went down for OS patching and were rebooted, but the CRS resources did not come up on either node after the reboot.

Connect as the root user and check all resources:

[root@oradev11 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE
ora.crf
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       oradev11
ora.ctssd
      1        ONLINE  OFFLINE
ora.diskmon
      1        ONLINE  OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       oradev11
ora.evmd
      1        ONLINE  OFFLINE
ora.gipcd
      1        ONLINE  OFFLINE
ora.gpnpd
      1        ONLINE  OFFLINE
ora.mdnsd
      1        ONLINE  OFFLINE                               STARTING


The CRS alert log says:

[root@oradev11 ] # cd $GRID_HOME/log/hostname
[root@oradev11 oradev11]# tail -50f alertoradev11.log

(output trimmed)

2015-04-22 20:06:01.173:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(24990)]CRS-5818:Aborted
2015-04-22 20:06:05.177:
[ohasd(12696)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.mdnsd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0.4/grid/log/sl73vmhasd/ohasd.log.
2015-04-22 20:06:05.658:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25614)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagenagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:06:05.659:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25614)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_gridgrid.log"
2015-04-22 20:06:06.176:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagenagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:06:06.176:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_gridgrid.log"
2015-04-22 20:06:06.272:
[gpnpd(25644)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/gpnpd/gpnpd.log". Additional diagnostics: LFI-00004: ibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:06:06.272:
[gpnpd(25644)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/gpnpd/gpnpd.log"
2015-04-22 20:06:09.314:
[gpnpd(25644)]CRS-2329:GPNPD on node oradev11 shutdown.
2015-04-22 20:08:06.226:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(25631)]CRS-5818:Aborted command 'start' for resource 'ora.gpnpd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0.4/grid/logbd001/agent/ohasd/oraagent_grid/oraagent_grid.log.
2015-04-22 20:08:10.229:
[ohasd(12696)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.gpnpd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0.4/grid/log/sl73vmhasd/ohasd.log.
2015-04-22 20:08:10.710:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26582)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagenagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:08:10.710:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26582)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_gridgrid.log"
2015-04-22 20:08:11.280:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26604)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagenagent_grid.log". Additional diagnostics: LFI-00004: Call to lfibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:08:11.280:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(26604)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/agent/ohasd/oraagent_gridgrid.log"
2015-04-22 20:08:11.347:
[mdnsd(26617)]CRS-0037:An error occurred while attempting to write to file "/u01/app/11.2.0.4/grid/log/oradev11/mdnsd/mdnsd.log". Additional diagnostics: LFI-00004: ibwrt() failed.
LFI-01518: write() failed(OSD return value = 28) in slfiwl.

2015-04-22 20:08:11.347:
[mdnsd(26617)]CRS-0004:logging terminated for the process. log file: "/u01/app/11.2.0.4/grid/log/oradev11/mdnsd/mdnsd.log"
2015-04-22 20:08:11.351:
[mdnsd(26617)]CRS-5602:mDNS service stopping by request.

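After spending a lot of time troubleshooting, I checked the disk space on the server and realized the cause was a space issue: the mount point where my Grid home is located (/u01/app) was full, so the CRS resources could not come up. The “OSD return value = 28” in the LFI-01518 messages above is errno 28 on Linux (ENOSPC, no space left on device).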

[root@oradev11 bin]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv
                      5.8G  1.8G  3.8G  33% /
tmpfs                 3.0G     0  3.0G   0% /dev/shm
/dev/sda1             190M   86M   95M  48% /boot
/dev/mapper/rootvg-homelv
                      2.0G  9.2M  1.8G   1% /home
/dev/mapper/rootvg-optlv
                      9.8G  2.0G  7.3G  22% /opt
/dev/mapper/rootvg-securlv
                      1.5G  211M  1.2G  16% /opt/security
/dev/mapper/rootvg-tmplv
                      2.0G  375M  1.5G  21% /tmp
/dev/mapper/rootvg-varlv
                      9.8G  1.1G  8.2G  12% /var
/dev/mapper/datavg-gridbaselv
                       50G   49G     0 100% /u01/app
/dev/mapper/datavg-rdbmsbaselv
                       50G  4.8G   42G  11% /u01/app/oracle
/dev/mapper/datavg-adrrepolv
                       50G  2.6G   45G   6% /oratrace
/dev/mapper/datavg-oemagentlv
                       20G  651M   18G   4% /u01/app/emagent
/dev/mapper/datavg-gglv
                       50G   52M   47G   1% /gg
/dev/mapper/datavg-dbawslv
                       99G   16G   79G  17% /oraworkspace
/dev/mapper/datavg-auditfslv
                       50G  230M   47G   1% /oradbaudit
/dev/mapper/datavg-dbtoolslv
                      9.8G   86M  9.2G   1% /oratools
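
With this many mount points the full one is easy to miss, so it helps to filter the list. A minimal sketch that reads df -P output (one line per filesystem, unlike the wrapped df -h output above); the function name full_filesystems is only illustrative, and mount points containing spaces are not handled.

```sh
#!/bin/sh
# Read `df -P` output on stdin and print "mountpoint use%" for every
# filesystem at or above the given usage percentage.
# Assumption: mount points contain no spaces.
full_filesystems() {
    threshold=$1
    awk -v limit="$threshold" '
        NR > 1 {                     # skip the header line
            use = $5
            sub(/%$/, "", use)       # 100% -> 100
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Usage: df -P | full_filesystems 90
```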

I checked whether anything could be deleted on the /u01/app mount point and found that "crfclust.bdb" was consuming far more space than any other file:

[root@oradev11 bin]# cd ../crf/db
[root@oradev11 db]# ls -lrht
total 4.0K
drwxr-x--- 2 root oinstall 4.0K Apr 22 20:45 oradev11
[root@oradev11 db]# cd oradev11

[root@oradev11 oradev11]# ls -lrth
total 38G
-rw-r--r-- 1 root root 1.1M Sep  8  2014 08-SEP-2014-09:24:06.txt
-rw-r--r-- 1 root root 1.9M Sep  8  2014 08-SEP-2014-10:07:28.txt
-rw-r--r-- 1 root root 1.2M Sep  8  2014 08-SEP-2014-10:20:00.txt
-rw-r----- 1 root root 8.0K Nov 20 09:44 repdhosts.bdb
-rw-r--r-- 1 root root  74K Mar  9 10:53 09-MAR-2015-10:53:37.txt
-rw-r--r-- 1 root root 856K Mar  9 10:56 09-MAR-2015-10:56:42.txt
-rw-r--r-- 1 root root  77K Mar 13 19:21 13-MAR-2015-19:21:26.txt
-rw-r--r-- 1 root root 218K Mar 13 19:21 13-MAR-2015-19:21:44.txt
-rw-r----- 1 root root  16M Apr 22 12:19 log.0000007983
-rw-r----- 1 root root  24K Apr 22 20:42 __db.001
-rw-r--r-- 1 root root 115M Apr 22 20:42 oradev11.ldb
-rw-r----- 1 root root 8.0K Apr 22 20:43 crfconn.bdb
-rw-r--r-- 1 root root 777K Apr 22 20:45 22-APR-2015-20:45:53.txt
-rw-r----- 1 root root  56K Apr 22 20:56 __db.006
-rw-r----- 1 root root 392K Apr 22 20:56 __db.002
-rw-r----- 1 root root 812M Apr 22 20:56 crfloclts.bdb
-rw-r----- 1 root root 668M Apr 22 20:56 crfcpu.bdb
-rw-r----- 1 root root 743M Apr 22 20:56 crfalert.bdb
-rw-r----- 1 root root 526M Apr 22 20:56 crfts.bdb
-rw-r----- 1 root root 607M Apr 22 20:56 crfhosts.bdb
-rw-r----- 1 root root  34G Apr 22 20:56 crfclust.bdb
-rw-r----- 1 root root  16M Apr 22 20:56 log.0000007984
-rw-r----- 1 root root 1.2M Apr 22 20:56 __db.005
-rw-r----- 1 root root 2.1M Apr 22 20:56 __db.004
-rw-r----- 1 root root 2.6M Apr 22 20:56 __db.003

The output above shows that “crfclust.bdb” alone is consuming most of the space (34G), so I followed the steps given in the Oracle doc to free up space on the server.


Stop ora.crf:

[root@oradev11 bin]# ./crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oradev11'
CRS-2677: Stop of 'ora.crf' on 'oradev11' succeeded

[root@oradev11 oradev11]# rm crfclust.bdb

[root@oradev11 oradev11]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv
                      5.8G  1.8G  3.8G  33% /
tmpfs                 3.0G  854M  2.2G  28% /dev/shm
/dev/sda1             190M   86M   95M  48% /boot
/dev/mapper/rootvg-homelv
                      2.0G  9.2M  1.8G   1% /home
/dev/mapper/rootvg-optlv
                      9.8G  2.0G  7.3G  22% /opt
/dev/mapper/rootvg-securlv
                      1.5G  211M  1.2G  16% /opt/security
/dev/mapper/rootvg-tmplv
                      2.0G  376M  1.5G  21% /tmp
/dev/mapper/rootvg-varlv
                      9.8G  1.1G  8.2G  12% /var
/dev/mapper/datavg-gridbaselv
                       50G   13G   34G  28% /u01/app
/dev/mapper/datavg-rdbmsbaselv
                       50G  4.8G   42G  11% /u01/app/oracle
/dev/mapper/datavg-adrrepolv
                       50G  2.6G   45G   6% /oratrace
/dev/mapper/datavg-oemagentlv
                       20G  651M   18G   4% /u01/app/emagent
/dev/mapper/datavg-gglv
                       50G   52M   47G   1% /gg
/dev/mapper/datavg-dbawslv
                       99G   16G   79G  17% /oraworkspace
/dev/mapper/datavg-auditfslv
                       50G  231M   47G   1% /oradbaudit
/dev/mapper/datavg-dbtoolslv
                      9.8G   86M  9.2G   1% /oratools
/dev/asm/ggatevol-387
                       20G  562M   20G   3% /gg/GG11
                                                           
Start it again:

[root@oradev11 bin]# ./crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'oradev11'

CRS-2676: Start of 'ora.crf' on 'oradev11' succeeded

[root@oradev11 bin]# ./crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       oradev11           Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       oradev11
ora.crf
      1        ONLINE  ONLINE       oradev11
ora.crsd
      1        ONLINE  ONLINE       oradev11
ora.cssd
      1        ONLINE  ONLINE       oradev11
ora.cssdmonitor
      1        ONLINE  ONLINE       oradev11
ora.ctssd
      1        ONLINE  ONLINE       oradev11           OBSERVER
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       oradev11
ora.evmd
      1        ONLINE  ONLINE       oradev11
ora.gipcd
      1        ONLINE  ONLINE       oradev11
ora.gpnpd
      1        ONLINE  ONLINE       oradev11
ora.mdnsd
      1        ONLINE  ONLINE       oradev11


Now the resources are up and running again (ora.diskmon is the exception: its target is OFFLINE in the output above).
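
For the other node, or for next time, the steps above can be printed as a plan before anything is run. A sketch only: the function name chm_cleanup_plan is made up, it prints the commands rather than executing them, and the crf/db/<node> location under the Grid home is taken from the session above and may differ on other installations.

```sh
#!/bin/sh
# Print (do not run) the command sequence used above: stop ora.crf, remove
# the oversized crfclust.bdb, start ora.crf, then check the resources.
# Assumption: the CHM repository lives in <grid_home>/crf/db/<node>.
chm_cleanup_plan() {
    grid_home=$1
    node=$2
    printf '%s\n' \
        "$grid_home/bin/crsctl stop res ora.crf -init" \
        "rm $grid_home/crf/db/$node/crfclust.bdb" \
        "$grid_home/bin/crsctl start res ora.crf -init" \
        "$grid_home/bin/crsctl stat res -t -init"
}

# Usage (as root): chm_cleanup_plan /u01/app/11.2.0.4/grid oradev11
# Review the printed commands, then run them one at a time.
```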

Refer:
Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)