I'm seeing something odd on a new server of mine. On the old one these values stay quite low.
On the new one I now get something like this:
blocks read/sec blocks write/sec
sda2 92.02 2241.60 1168.47 895250 466664
On the other server, which receives far more requests, blocks read/sec never reaches 250.
There is no unusual process running.
------------------
Something odd happened a little while ago: the server went completely offline, so I rebooted it through a panel the datacenter provides and it came back up. After that, the process:
./jre/bin/java -Djava.compiler=NONE -cp /usr/StorMan/RaidMan.jar com.ibm.sysmgt.raidmgr.agent.ManagementAgent
started using a bit more CPU and memory than usual, then went back to normal.
Could it be a failing disk? My server uses RAID 1.
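A quick sanity check on the numbers in this post, as a sketch only. It assumes the five values on the sda2 line are iostat's tps, Blk_read/s, Blk_wrtn/s, Blk_read and Blk_wrtn columns (the pasted header names only two of them), and that one block is 512 bytes, which is what iostat documents for 2.4 and later kernels. When iostat is run without an interval, its figures are averages since boot, so dividing a total by its rate gives the length of that averaging window.

```python
# Assumption: iostat "blocks" are 512-byte sectors (kernels 2.4 and later).
BLOCK_BYTES = 512
MIB = 1024 * 1024


def summarize(blk_read_s, blk_wrtn_s, blk_read, blk_wrtn):
    """Return (averaging window in seconds, read MiB/s, write MiB/s)."""
    # Total blocks divided by blocks per second gives the time covered.
    window_read = blk_read / blk_read_s
    window_write = blk_wrtn / blk_wrtn_s
    window_s = (window_read + window_write) / 2
    read_mib_s = blk_read_s * BLOCK_BYTES / MIB
    write_mib_s = blk_wrtn_s * BLOCK_BYTES / MIB
    return window_s, read_mib_s, write_mib_s


# Values copied from the sda2 line; the column mapping is an assumption.
window_s, read_mib_s, write_mib_s = summarize(2241.60, 1168.47, 895250, 466664)

if __name__ == "__main__":
    print(f"averaging window: {window_s:.0f} s")
    print(f"read:  {read_mib_s:.2f} MiB/s")
    print(f"write: {write_mib_s:.2f} MiB/s")
```

Under those assumptions the averages cover only about 400 seconds, with roughly 1.1 MiB/s read and 0.6 MiB/s written, so they may largely reflect activity right after the reboot rather than steady load. Running iostat with an interval (for example `iostat -d 5`) reports per-interval rates after the first report.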

Blocks Read/sec and Blocks Written/sec. Is a very high value normal?
#2
Posted 05 August 2009 - 11:41
I found the following error lines in /var/log/messages
----------
Opened configuration file /etc/smartd.conf
Aug 5 09:14:24 host338728 smartd[5270]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Aug 5 09:14:24 host338728 smartd[5270]: Problem creating device name scan list
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sda, opened
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sda, Bad IEC (SMART) mode page, err=5, skip device
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sdb, opened
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sdb, Bad IEC (SMART) mode page, err=5, skip device
Aug 5 09:14:24 host338728 smartd[5270]: Monitoring 0 ATA and 0 SCSI devices
Aug 5 09:14:24 host338728 smartd[5272]: smartd has fork()ed into background mode. New PID=5272.
Aug 5 09:14:24 host338728 avahi-daemon[5132]: Service "SFTP File Transfer on host338728" (/services/sftp-ssh.service) successfully established.
Aug 5 09:15:38 host338728 kernel: ata3.00: qc timeout (cmd 0xa0)
Aug 5 09:15:38 host338728 kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Aug 5 09:15:38 host338728 kernel: ata3.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 8 in
Aug 5 09:15:38 host338728 kernel: cdb 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Aug 5 09:15:38 host338728 kernel: res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
Aug 5 09:15:38 host338728 kernel: ata3.00: status: { DRDY ERR }
Aug 5 09:15:38 host338728 kernel: ata3: hard resetting link
Aug 5 09:15:38 host338728 kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 5 09:15:39 host338728 kernel: ata3.00: revalidation failed (errno=-2)
Aug 5 09:15:39 host338728 kernel: ata3: failed to recover some devices, retrying in 5 secs
Aug 5 09:15:44 host338728 kernel: ata3: hard resetting link
Aug 5 09:15:44 host338728 kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 5 09:15:45 host338728 kernel: ata3.00: revalidation failed (errno=-2)
Aug 5 09:15:45 host338728 kernel: ata3: failed to recover some devices, retrying in 5 secs
Aug 5 09:15:50 host338728 kernel: ata3: hard resetting link
Aug 5 09:15:50 host338728 kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 5 09:15:50 host338728 kernel: ata3.00: revalidation failed (errno=-2)
Aug 5 09:15:51 host338728 kernel: ata3.00: disabled
Aug 5 09:15:51 host338728 kernel: ata3: hard resetting link
Aug 5 09:15:51 host338728 kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 5 09:15:51 host338728 kernel: ata3: EH complete
Aug 5 09:17:30 host338728 ntpd[4561]: synchronized to LOCAL(0), stratum 10
Aug 5 09:17:30 host338728 ntpd[4561]: kernel time sync enabled 0001
Aug 5 09:18:36 host338728 ntpd[4561]: synchronized to 128.10.19.24, stratum 1
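To see at a glance which devices the pasted messages actually name, here is a small sketch that groups the lines by source. It only sorts lines into buckets and does not diagnose anything; the regular expressions are my own assumption about the syslog layout shown above, and `summarize` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed layout: "Mon D HH:MM:SS host program[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<prog>[\w.-]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)
ATA_RE = re.compile(r"^(ata\d+)(?:\.\d+)?:")        # e.g. "ata3.00:" or "ata3:"
SMART_SKIP_RE = re.compile(r"Device: (\S+), .*skip device")


def summarize(lines):
    """Return (devices smartd skipped, kernel messages grouped by ATA port)."""
    skipped = []
    ata_events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        prog, msg = m.group("prog"), m.group("msg")
        if prog == "smartd":
            s = SMART_SKIP_RE.search(msg)
            if s:
                skipped.append(s.group(1))
        elif prog == "kernel":
            a = ATA_RE.match(msg)
            if a:
                ata_events[a.group(1)].append(msg)
    return skipped, dict(ata_events)


# A few lines copied from the log above.
SAMPLE = """
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sda, Bad IEC (SMART) mode page, err=5, skip device
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sdb, Bad IEC (SMART) mode page, err=5, skip device
Aug 5 09:14:24 host338728 smartd[5270]: Monitoring 0 ATA and 0 SCSI devices
Aug 5 09:15:38 host338728 kernel: ata3.00: qc timeout (cmd 0xa0)
Aug 5 09:15:38 host338728 kernel: cdb 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Aug 5 09:15:38 host338728 kernel: ata3: hard resetting link
Aug 5 09:15:51 host338728 kernel: ata3.00: disabled
""".splitlines()

skipped, ata_events = summarize(SAMPLE)

if __name__ == "__main__":
    print("smartd skipped:", skipped)
    for port, msgs in ata_events.items():
        print(port, "->", len(msgs), "kernel messages")
```

In this excerpt the smartd lines name /dev/sda and /dev/sdb, while every kernel error names port ata3; nothing in the log itself says which device sits on ata3. The command in the timeout, 0xa0, is the ATA PACKET opcode, which is normally sent to ATAPI devices such as optical drives, so it may be worth checking the boot messages for what is attached to ata3 before drawing conclusions about the disks.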
#3
Posted 05 August 2009 - 05:48
Folks, something really is wrong.
This can't possibly be right.
This morning I ran a test to check for bad blocks.
I couldn't watch the test through to the end because my connection dropped.
But look at this.
The blocks read/sec and write/sec are extremely high.
See:
reads/s: 32800.54
write/s: 3909.08
What is this? It looks very abnormal.
#4
Posted 05 August 2009 - 05:53
These are symptoms of an apparent disk failure.
Tell your DC and ask for these HDs to be replaced immediately. They could stop working entirely and make you lose everything.
Good business to you.
#5
Posted 06 August 2009 - 09:50
Support said everything is fine:
Hello,
The array is in optimal stage.
---------------
root@xxx [~]# /usr/StorMan/arcconf getconfig 1 | grep -i state
State : Online
State : Online
State : Online
root@xxx [~]# /usr/StorMan/arcconf getconfig 1 | grep -i status
Controller Status : Optimal
Status : Optimal
Status of logical device : Optimal
Status of logical device : Optimal
---------------
The following parameters of iostat are also not high.
1. The average service time (svctm).
2. Percentage of CPU time during which I/O requests were issued (%util).
3. Reads/second and writes/second (r/s and w/s).
---------------
root@host338728 [~]# iostat -x -d
Linux 2.6.18-128.2.1.el5 (xxx.xxx.com) 08/05/2009
Device:  rrqm/s  wrqm/s     r/s     w/s    rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        3.18  314.81  127.66  281.24  16284.20  4769.17    51.49     3.69   9.02   0.28  11.32
sda1       0.54    0.00    0.07    0.00      8.88     0.00   125.66     0.00   0.82   0.46   0.00
sda2       2.59  314.79  127.59  281.24  16275.22  4768.99    51.47     3.69   9.02   0.28  11.31
sda3       0.04    0.02    0.00    0.00      0.07     0.18    62.82     0.00   7.04   5.10   0.00
sdb        0.00    0.00    0.00    0.00      0.02     0.00    24.47     0.00   1.26   1.24   0.00
---------------
So there is no need to worry about the health of the drives in the array. As always, if you need any further assistance please let us know.
But I don't really believe it. I need an explanation for what is happening.
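For anyone who wants to cross-check the three iostat figures support mentions, here is a rough sketch that parses the sda row and recomputes a few values from it. It assumes 512-byte sectors for rsec/s and wsec/s, and uses the approximate relation %util ≈ (r/s + w/s) × svctm / 10, with svctm in milliseconds; `parse_row` and `derived` are hypothetical helper names.

```python
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s "
          "avgrq-sz avgqu-sz await svctm %util")
ROW = ("sda 3.18 314.81 127.66 281.24 16284.20 4769.17 "
       "51.49 3.69 9.02 0.28 11.32")

SECTOR_BYTES = 512  # assumption: iostat sectors are 512 bytes
MIB = 1024 * 1024


def parse_row(header, row):
    """Map the column names of an `iostat -x` header onto one device row."""
    names = header.split()[1:]          # drop the "Device:" label
    fields = row.split()
    return fields[0], dict(zip(names, map(float, fields[1:])))


def derived(stats):
    """Recompute a few figures from the raw per-second counters."""
    iops = stats["r/s"] + stats["w/s"]
    sectors = stats["rsec/s"] + stats["wsec/s"]
    return {
        "avg_request_sectors": sectors / iops,         # compare with avgrq-sz
        "util_percent": iops * stats["svctm"] / 10.0,  # compare with %util
        "read_mib_s": stats["rsec/s"] * SECTOR_BYTES / MIB,
        "write_mib_s": stats["wsec/s"] * SECTOR_BYTES / MIB,
    }


device, stats = parse_row(HEADER, ROW)
d = derived(stats)

if __name__ == "__main__":
    print(device)
    for key, value in d.items():
        print(f"  {key}: {value:.2f}")
```

The recomputed request size matches the reported avgrq-sz, and the recomputed utilisation lands close to the reported %util (svctm is rounded to two decimals, so an exact match is not expected). Read throughput works out to roughly 8 MiB/s. Since `iostat -x -d` was run without an interval, these are averages since boot rather than a snapshot of current load.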
#6
Posted 06 August 2009 - 09:14
A good explanation would be errors such as:
"
Aug 5 09:14:24 host338728 smartd[5270]: Device: /dev/sda, Bad IEC (SMART) mode page, err=5, skip device
"
and the fact that the number of reads per second on the HDs is absurdly high. That is a sign the HD is having difficulty reading the data. It is having to work very hard to manage it.
#7
Posted 06 August 2009 - 10:55
What command can I run to check these errors, so I can rub them in the data center's face?
#8
Posted 07 August 2009 - 12:16
You can use the same ones you already pasted yourself:
"/var/log/messages
----------
Opened configuration file /etc/smartd.conf
(...)
"
That shows the SMART warnings about the errors.
#9
Posted 12 August 2009 - 01:06
Folks, I still have this problem. The commands for testing the disk no longer work.
All I get is: "A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
"
Please tell me which commands you use for testing.
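A minimal sketch of the retry that the error message itself suggests, for whoever wants to script it. It only uses smartctl options I am sure exist: `-H` (health check), `-T permissive` (the tolerance setting named in the message) and `-d TYPE` (device type). Which `-d` value suits this controller is not known from the thread, so it is left unset; `smartctl_command` and `run_health_check` are hypothetical helper names.

```python
import shutil
import subprocess


def smartctl_command(device, permissive=True, device_type=None):
    """Build the argv for a smartctl health check on one device."""
    argv = ["smartctl", "-H"]
    if device_type:
        # The right type depends on the controller; unknown here.
        argv += ["-d", device_type]
    if permissive:
        # Suggested verbatim by the "mandatory SMART command failed" message.
        argv += ["-T", "permissive"]
    argv.append(device)
    return argv


def run_health_check(device, **kwargs):
    """Run smartctl if it is installed; return (exit code, output) or None."""
    if shutil.which("smartctl") is None:
        return None
    proc = subprocess.run(
        smartctl_command(device, **kwargs),
        capture_output=True, text=True, check=False,
    )
    return proc.returncode, proc.stdout


command = smartctl_command("/dev/sda")

if __name__ == "__main__":
    print(" ".join(command))
    print(run_health_check("/dev/sda"))
```

Keep in mind that behind a hardware RAID controller such as the one arcconf manages, /dev/sda may be the logical array rather than a physical disk, in which case SMART data might not be reachable this way even in permissive mode, and the controller's own tool may be the only reliable source.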