There are currently only 3 PDAs in use. The plan is to reach about 40, but we started with 3 to see if everything goes OK. I will check the performance ASAP, but to be honest I don't think 3 PDAs could be causing the issue.
In the meantime, is there anything else we can check?
The err.log doesn't show anything about the errors produced on the client. At the specific times when the client's olsynclog shows the errors, there is no entry in the server's err.log.
Is this err.log an Oracle Lite file or an Oracle DB file?
Moreover, the sync attempts that return the lock failure message are not shown in the sync performance history of the Mobile Manager.
My guess is that the client is throwing the error before it contacts the server.
Let me add something more.
When the sync happens there are 4 bars: composing, sending, receiving and processing. The second bar (sending) fills up BUT the tick in front of the bar is not activated. Then there is an unusually long waiting period, and when the "Timeout when waiting..." error is thrown, the tick becomes visible and the sync stops. I don't know if that is of any importance.
Any other ideas? :(
Thank you for all your efforts and time.
Nope, there is nothing in the err.log for that date and time. There is something interesting on the handheld, though.
I have set up debugging on the handheld, and each time I try to sync it creates 2 debugx.txt files.
In the first one I notice that at some point it says:
last write 470 total_out=7759
8:38:23.000 octrDmHttp::sendRaw===>dmHttpWrite... 8:38:24.000 dmHttpWrite(4)=4
8:38:24.000 octrDmHttp::sendRaw===>dmHttpWrite... 8:38:24.000 dmHttpWrite(1024)=1024
m_sentLenInReq=1024 m_totalSentLen=1024 SendPart ret nlen=1024
sendPart(4096) 8:38:24.000 octrDmHttp::sendRaw===>dmHttpWrite... 8:38:24.000 dmHttpWrite(4096)=4096
m_sentLenInReq=5120 m_totalSentLen=5120 SendPart ret nlen=4096
sendPart(2878) 8:38:24.000 octrDmHttp::sendRaw===>dmHttpWrite... 8:38:24.000 dmHttpWrite(2878)=2878
m_sentLenInReq=7998 m_totalSentLen=7998 SendPart ret nlen=2878
8:38:24.000 octrDmHttp::sendRaw===>dmHttpSendRequest... 8:41:14.000 octrDmHttp::sendRaw<===dmHttpSendRequest
8:41:14.000 dmHttpRead... Content-Length=0
TIMEOUT_MAX = -1, socket timeout = 60
dmHttpRead err=10054 WINCE: 10054
8:41:14.000 receive done rec_success=0
AddLog(0 "ERROR",0,"08/19/2010 08:41:14",":10054 ","TZIAKOURIS_NICOLAS" )
8:41:14.000 okConnect1()... 8:41:14.000 okConnect1(\SD Card\conscli.odb)=2633737
CONNECT OKAPI conscli=2633737 okEnv=1a0b624
CreateFile error 2 \Orace\TZIAKOURIS_NICOLASolres.bin
8:41:14.000 CreateFile error 2 TZIAKOURIS_NICOLASolres.bin
8:41:14.000 ret2=0 DoProcess()=-2
ocDoSyncronize done 8:41:14.000 ocEnv=19fc3fc
ROLLBACK OKAPI conscli
DISCONNECT OKAPI conscli okEnv=1a0b624
8:52:47.000 okFinal 0
Then the second file created says:
Start of debug.txt Jun 16 2010 SP=2b9ef6cc
8:41:16.000 *** InitCCC env=27237160
8:41:16.000 okConnect0()... 8:41:16.000 okConnect0(\SD Card\conscli.odb)=0
Connect nopass okEnv=1a00f74
Error at C:\ADE\omeprod_ol103030\olite\db\build\win\ocapi\..\..\..\src\ocapi\username.cpp line:1453 rc:-3264
Build date Jun 16 2010
okErr=(Timeout when waiting for a lock)
AddLog(-3264 "ERROR",POL-3264,"08/19/2010 08:41:17","Timeout when waiting for a lock:C$INFO","myuser")
ROLLBACK OKAPI conscli
So it appears that first it gets the 10054 error and then the timeout error.
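To put a number on the long wait visible in the first debug file, here is a minimal Python sketch that measures the gap between the two timestamps on the dmHttpSendRequest line. The log line is copied from the debug output above; the function name `stall_seconds` and the regular expression are my own and only assume the `H:MM:SS.mmm` timestamp style seen in that file.

```python
import re
from datetime import datetime

# Line copied from the first debug file quoted above.
LINE = ("8:38:24.000 octrDmHttp::sendRaw===>dmHttpSendRequest... "
        "8:41:14.000 octrDmHttp::sendRaw<===dmHttpSendRequest")

# Matches timestamps such as 8:38:24.000 (assumed format, taken from the log).
TIMESTAMP = re.compile(r"\b(\d{1,2}:\d{2}:\d{2})\.\d{3}\b")

def stall_seconds(line):
    """Return the seconds between the first and last timestamp on a line,
    or None if the line holds fewer than two timestamps."""
    stamps = TIMESTAMP.findall(line)
    if len(stamps) < 2:
        return None
    first = datetime.strptime(stamps[0], "%H:%M:%S")
    last = datetime.strptime(stamps[-1], "%H:%M:%S")
    return (last - first).total_seconds()

print(stall_seconds(LINE))  # 170.0
```

So the request sat for about 170 seconds before the read failed with 10054, well beyond the "socket timeout = 60" printed in the same file.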
From the olsynclog I see that:
"ERROR",0,"08/19/2010 08:56:26",":10054 ","TZIAKOURIS_NICOLAS"
"ERROR",POL-3264,"08/19/2010 08:56:28","Timeout when waiting for a lock:C$INFO","TZIAKOURIS_NICOLAS"
So the 10054 error occurs first and then the timeout when waiting for a lock.
OK, so the question is: what is this 10054? What does this code mean and what can I do about it, please?
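For reference, 10054 is the standard Winsock code WSAECONNRESET (connection reset by peer), so it points at the network rather than at the database. Below is a rough Python sketch that reads the two olsynclog entries quoted above, sorts them by time and labels the socket code. The five-column layout is assumed from those two lines only, and the `parse` and `describe` helpers are hypothetical names, not part of any Oracle Lite tool.

```python
import csv
from datetime import datetime

# The two entries quoted from olsynclog above.
LOG = '''"ERROR",0,"08/19/2010 08:56:26",":10054 ","TZIAKOURIS_NICOLAS"
"ERROR",POL-3264,"08/19/2010 08:56:28","Timeout when waiting for a lock:C$INFO","TZIAKOURIS_NICOLAS"
'''

# Standard Winsock error codes (general Windows sockets, not Oracle Lite specific).
WINSOCK = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection timed out)",
    10061: "WSAECONNREFUSED (connection refused)",
}

def parse(text):
    """Return (time, code, message, user) tuples sorted by time.
    Assumes five comma-separated fields per line, as in the sample."""
    entries = []
    for row in csv.reader(text.splitlines()):
        if len(row) != 5:
            continue  # skip anything that does not look like a log entry
        _level, code, stamp, message, user = row
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
        entries.append((when, code, message.strip(), user))
    return sorted(entries)

def describe(message):
    """Translate a bare ':NNNNN' message into a Winsock name when known."""
    digits = message.lstrip(":").strip()
    if digits.isdigit():
        return WINSOCK.get(int(digits), "unknown socket error " + digits)
    return message

for when, code, message, user in parse(LOG):
    print(when.time(), code, describe(message))
```

Sorting by timestamp makes it easy to see which error came first when the log holds many sync attempts.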
On the server, in the webtogo.ora config file, do you have the following parameters set in the [CONSOLIDATOR] section?
The RESUME_CLIENT_TIMEOUT parameter is the number of seconds that the client should use to timeout network operations. The default is 60 seconds.
The RESUME_TIMEOUT parameter indicates how long to keep client data while the client is not connected. The default is 0, which means that resume is disabled and after disconnection, the client data is discarded. A short timeout, such as 15 minutes, is suitable to resume any accidentally dropped connections. A longer timeout may be needed if users explicitly pause and resume synchronization to switch networks or use a dialup connection for another purpose.
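If it helps to check these two values quickly, here is a small Python sketch that reads them from the [CONSOLIDATOR] section and falls back to the defaults described above. It assumes the file can be parsed as plain INI text, which may not hold for every line of a real webtogo.ora. The sample content is made up, and the value 900 assumes RESUME_TIMEOUT is expressed in seconds (15 minutes), which the description above does not state.

```python
import configparser

# Hypothetical sample content, NOT copied from a real server.
SAMPLE = """
[CONSOLIDATOR]
RESUME_CLIENT_TIMEOUT=60
RESUME_TIMEOUT=900
"""

# Defaults as described in the post above.
DEFAULTS = {"RESUME_CLIENT_TIMEOUT": 60, "RESUME_TIMEOUT": 0}

def resume_settings(text):
    """Return both resume parameters as integers, using the defaults
    when the section or a parameter is missing."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, allow_no_value=True)
    parser.read_string(text)
    section = parser["CONSOLIDATOR"] if parser.has_section("CONSOLIDATOR") else {}
    return {name: int(section.get(name, default))
            for name, default in DEFAULTS.items()}

settings = resume_settings(SAMPLE)
if settings["RESUME_TIMEOUT"] == 0:
    print("resume is disabled: client data is discarded after a disconnect")
else:
    print("resume is enabled:", settings)
```

To run it against the real file, read its text with `open(path).read()` and pass that to `resume_settings`.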
Also, if you could get a hold of that ERR.log file, that would be great.
I appreciate all your help on this item, guys.
It was very helpful.
In the end, after a series of stressful and exhausting tests, using a packet sniffer and network analysis software, and working together with the handheld manufacturer, it appears that
this issue occurs because over GPRS the handheld cannot calculate the correct packet (MTU) size. The combination of the cellular company's network and the GPRS handheld causes this issue.
After creating a new registry value in the handheld's registry that enables automatic MTU discovery, the problem appears to be solved. The problem first threw a WinCE 10054 error and then the "Timeout when waiting for a lock" error. The timeout message was deceiving and gave the wrong impression. The 10054 was the network issue, and once it was solved all went well.
In my opinion, if any of you have any kind of GPRS synchronization issue with a customer, the very first thing to try is to put the handheld on the cradle on a customer's PC and see if it synchronizes, no matter what the error message is. If it does, then go low-level with a network analyzer on the GPRS connection.