Theodore Moriello

HOW TO - Cleanup your UPLOAD directory

Discussion created by Theodore Moriello on Jan 2, 2020
Latest reply on Jan 2, 2020 by Theodore Moriello

I've seen this question asked a lot and never seen any answers.  I found a way to reduce the size of my on-site "uploads" folder by almost 30GB.  Thats a lot of recovered file system storage especially if you perform regular backups of your file system.

 

Uploads are mentioned in the developer guide and it was enough to point me in the right direction however there wasn't enough information to complete the task.  

https://support.sugarcrm.com/Documentation/Sugar_Developer/Sugar_Developer_Guide_9.0/Architecture/Uploads/ 

 

Specifically what I was targeting here was "Email Attachments".  Any time an email is sent from Sugar with an attachment, Sugar stores a copy of that file in the uploads directory but renames it to a GUID matching the ID of a note record.  What?!?!  Well if you think about it, it makes sense.  Multiple copies of "MyDoc.pdf" with different content would be impossible for Sugar to identify which is the right "MyDoc.pdf".  They used this mechanism to create a unique ID for all documents in the uploads directory.  They just never gave us an easy way to delete or cull this info.  It just grows and grows.....

 

This process was performed on CentOS 6 on an on-site installation but the process will basically be the same for any other on-site platforms.  There is a query in the developer doc to locate email attachments but it is kind of flawed and just needs a little tweaking.  Here is the query Sugar suggests:

 

SELECT filename, file_mime_type FROM notes
INNER JOIN emails ON notes.parent_type = 'Emails' AND notes.parent_id = emails.id
INNER JOIN emails_text ON emails.id = emails_text.email_id;

That will produce a TON of results.  Some even duplicated.  Slightly change this query's first line
and add another line at the bottom.

SELECT notes.id FROM notes
INNER JOIN emails ON notes.parent_type = 'Emails' AND notes.parent_id = emails.id
INNER JOIN emails_text ON emails.id = emails_text.email_id
WHERE notes.date_entered < '2019-09-01';


What I did here was added the note ID (to find the matching document GUIDs) and set a time frame. In
my case, I wanted to delete all email attachments over 3 months old.  Export this list to CSV which
will give you one column with GUIDS.  Then its a simple formula in Excel.  In Column B:
="/path/to/upload/directory/" & A2  {basically the cell to the left of it}.  Now drag that column down
the whole list, highlight Column B, copy and paste values.  Now delete Column A and you now have a
listing of full path file names to be deleted.  Copy this list to a text file on your Sugar server.

Now for CentOS, RHEL... Lets say I named the file "clearuploads.txt"
sudo xargs rm < clearuploads.txt


Thats it! This will remove the email attachments but leave the note in place. The note record will
still exist in Sugar and it will still have the attachment filename for reference however clicking the
hyperlink just presents a "javascript void(0)" dialog and nothing happens when you click it.

Outcomes