Tutorial :How do I fix Postgres so it will start after an abrupt shutdown?



Question:

Due to a sudden power outage, the PostGres server running on my local machine shut down abruptly. After rebooting, I tried to restart postgres and I get this error:

$ pg_ctl -D /usr/local/pgsql/data restart

pg_ctl: PID file "/usr/local/pgsql/data/postmaster.pid" does not exist  Is server running?  starting server anyway  server starting  $:/usr/local/pgsql/data$ LOG:  database system shutdown was interrupted at 2009-02-28 21:06:16   LOG:  checkpoint record is at 2/8FD6F8D0  LOG:  redo record is at 2/8FD6F8D0; undo record is at 0/0; shutdown FALSE  LOG:  next transaction ID: 0/1888104; next OID: 1711752  LOG:  next MultiXactId: 2; next MultiXactOffset: 3  LOG:  database system was not properly shut down; automatic recovery in progress  LOG:  redo starts at 2/8FD6F918  LOG:  record with zero length at 2/8FFD94A8  LOG:  redo done at 2/8FFD9480  LOG:  could not fsync segment 0 of relation 1663/1707047/1707304: No such file or directory  FATAL:  storage sync failed on magnetic disk: No such file or directory  LOG:  startup process (PID 5465) exited with exit code 1  LOG:  aborting startup due to startup process failure  

There is no postmaster.pid file in the data directory. What possibly could be the reason for this sort of behavior and of course what is the way out?


Solution:1

You'd need to pg_resetxlog. Your database can be in an inconsistent state after this though, so dump it with pg_dumpall, recreate and import back.

A cause for this could be:

  • You have not turned off hardware write cache on disk, which often prevents the OS from making sure data is written before it reports successful write to application. Check with

    hdparm -I /dev/sda

    If it shows "*" before "Write cache" then this could be the case. Source of PostgreSQL has a program src/tools/fsync/test_fsync.c, which tests speed of syncing data with disk. Run it - if it reports all times shorter than, say, 3 seconds than your disk is lying to OS - on a 7500rpm disks a test of 1000 writes to the same place would need at least 8 seconds to complete (1000/(7500rpm/60s)) as it can only write once per route. You'd need to edit this test_fsync.c if your database is on another disk than /var/tmp partition - change

    #define FSYNC_FILENAME "/var/tmp/test_fsync.out"

    to

    #define FSYNC_FILENAME "/usr/local/pgsql/data/test_fsync.out"

  • Your disk is failing and has a bad block, check with badblocks.

  • You have a bad RAM, check with memtest86+ for at least 8 hours.


Solution:2

Reading a few similar messages in the archives of the PostgreSQL mailing list ("storage sync failed on magnetic disk: No such file or directory") seems to indicate that there is a very serious hardware trouble, much worse than a simple power failure. You may have to prepare yourself to restore from backups.


Solution:3

First thing I'd try is running fsck on that disk if you haven't already done so.


Solution:4

Run start instead of restart. Execute the below command:

$pg_ctl -D /usr/local/pgsql/data start  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »