Common Issues
Backends Not Up
The backends need to be up for search to run. When the backends are up, running npusearch_check will look something like this:
lrl_admin@guava:~$ npusearch_check
[
"npusearch:request:guava-0-1710520177",
"npusearch:request:guava-1-1710520177",
"npusearch:request:guava-10-1710520177",
"npusearch:request:guava-11-1710520177",
"npusearch:request:guava-12-1710520177",
"npusearch:request:guava-13-1710520177",
"npusearch:request:guava-14-1710520177",
"npusearch:request:guava-15-1710520177",
"npusearch:request:guava-2-1710520177",
"npusearch:request:guava-3-1710520177",
"npusearch:request:guava-4-1710520177",
"npusearch:request:guava-5-1710520177",
"npusearch:request:guava-6-1710520177",
"npusearch:request:guava-7-1710520177",
"npusearch:request:guava-8-1710520177",
"npusearch:request:guava-9-1710520177"
]
When the backends are down, you will see this:
lrl_admin@guava:~$ npusearch_check
[]
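The check above can also be scripted. The following is a minimal sketch, assuming the text printed by npusearch_check is a valid JSON array as in the two examples shown; the function names parse_backends and backends_up are illustrative and not part of NPUSearch.

```python
import json


def parse_backends(check_output):
    """Parse the JSON array printed by npusearch_check into a list of names.

    Assumption: the output is a JSON array of strings, as in the examples above.
    """
    backends = json.loads(check_output)
    if not isinstance(backends, list):
        raise ValueError("expected a JSON array of backend names")
    return backends


def backends_up(check_output):
    """Return True if at least one backend is listed, False for an empty array."""
    return len(parse_backends(check_output)) > 0
```

An empty array means the backends are down, so backends_up("[]") is False.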
Here are some steps to try to bring the backends up when they are down:
1. Restart using systemctl
Run sudo systemctl restart npusearch.service. Wait about 15 seconds, then try npusearch_check again.
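The restart-and-recheck step can be sketched as a small script. This is a hedged example, not an official tool: it assumes npusearch_check is on the PATH and prints a JSON array, and the helper names run and restart_and_check are hypothetical. The command runner and sleep function are parameters so the logic can be exercised without touching the real service.

```python
import json
import subprocess
import time


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


def restart_and_check(run_cmd=run, wait_seconds=15, sleep=time.sleep):
    """Restart npusearch.service, wait, then return the list of backends.

    Returns an empty list if the backends are down or the output is not
    a JSON array (assumed output format, based on the examples above).
    """
    run_cmd(["sudo", "systemctl", "restart", "npusearch.service"])
    sleep(wait_seconds)  # give the backends time to come up
    output = run_cmd(["npusearch_check"])
    try:
        backends = json.loads(output)
    except json.JSONDecodeError:
        return []
    return backends if isinstance(backends, list) else []
```

If the returned list is still empty after the wait, continue with the status check in the next step.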
2. Check status using systemctl
Run sudo systemctl status npusearch.service.
- If the last thing it prints is that it's satisfying a license, it got stuck during the startup process. If your device has SmartSSDs, try sudo systemctl restart mpd.service.
- If it says "Failed to start NPUSearch search backends." and your device has SmartSSDs, try sudo systemctl restart mpd.service. If your device has Kuona cards, try sudo insmod npusearch. If that errors, try sudo dpkg-reconfigure npusearch.
Repeat step 1.
3. Check log messages
In /opt/lrl/etc/npusearch.conf, the line export LOGFILE=path/to/logfile sets where NPUSearch writes its logs. If the line is commented out, uncomment it and set a path for the log file. Run steps 1 and 2 again, then read the logs to see where the problem occurs.
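To see quickly whether logging is enabled, the configuration file can be scanned for the LOGFILE line. The sketch below assumes the line has the form export LOGFILE=... and that a commented-out line starts with #; the function name find_logfile is hypothetical and any paths in the usage are examples only.

```python
import re

# Matches "export LOGFILE=<path>", optionally commented out with a leading "#".
_LOGFILE_RE = re.compile(r"^\s*(#\s*)?export\s+LOGFILE=(.*)$")


def find_logfile(conf_text):
    """Return (path, enabled) for the LOGFILE setting in the config text.

    enabled is True when an uncommented line exists (the last one wins).
    If only a commented-out line exists, its path is returned with enabled
    False. Returns (None, False) when there is no LOGFILE line at all.
    """
    active = None
    commented = None
    for line in conf_text.splitlines():
        match = _LOGFILE_RE.match(line)
        if not match:
            continue
        path = match.group(2).strip().strip("\"'")
        if match.group(1) is None:
            active = path
        else:
            commented = path
    if active is not None:
        return active, True
    return commented, False
```

For example, find_logfile(open("/opt/lrl/etc/npusearch.conf").read()) returns the configured path together with a flag that shows whether the line still needs to be uncommented.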
Search performance is lower than expected
This is commonly caused by the SSDs overheating and throttling.
- Double-check in iDRAC that the minimum fan speed is set to 100%. Also use iDRAC to check that the inlet air temperature into the server is not too hot. If you change anything at this step, run the tests again.
- Inspect the ssd_nvme_smart_log_data.ndjson file to see how hot the SSDs are getting. Each line of that file is a nvme smart-log output for each SSD at a given timestamp. The thermal test will fail if any SSD reaches 349 Kelvin, but some SSDs will throttle performance before getting that hot. If desired, send the ssd_nvme_smart_log_data.ndjson and npusearch_install.log files to support@lewis-rhodes.com. LRL can do detailed analysis to help determine if throttling is happening.
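The peak temperature of each SSD can be pulled out of the NDJSON file with a short script. This is only a sketch under stated assumptions: the exact record layout is not described here, so it assumes each line is a JSON object (or a list of objects) with a temperature field in Kelvin, as nvme smart-log reports in its JSON output, and a device field that identifies the SSD. The device key name and the function names are hypothetical; adjust them to match the actual file.

```python
import json

THERMAL_FAIL_KELVIN = 349  # failure threshold stated in the text above


def max_temperatures(ndjson_text, device_key="device", temp_key="temperature"):
    """Return {device: highest temperature seen, in Kelvin}.

    Assumption: every record carries temp_key in Kelvin; device_key is a
    hypothetical field name. Records without a temperature are skipped.
    """
    peaks = {}
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue
        parsed = json.loads(line)
        records = parsed if isinstance(parsed, list) else [parsed]
        for record in records:
            if not isinstance(record, dict) or temp_key not in record:
                continue
            device = record.get(device_key, "unknown")
            temp = record[temp_key]
            peaks[device] = max(peaks.get(device, temp), temp)
    return peaks


def at_or_over_limit(peaks, limit=THERMAL_FAIL_KELVIN):
    """Return the sorted devices whose peak temperature reached the limit."""
    return sorted(dev for dev, temp in peaks.items() if temp >= limit)
```

To use it, read the file with open("ssd_nvme_smart_log_data.ndjson").read() and pass the text to max_temperatures. SSDs that stay below 349 Kelvin may still be throttling, so peaks close to the limit are worth a closer look.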